Optimizing character search with bitwise operations

In the previous post about the 1 Billion Row Challenge, I used bitwise operations to find a separator in the string. It’s time to describe it in more detail. This is a source code:

func findPosition(word uint64, symbol byte, offset int) (int, bool) {
	var mask uint64 = 0x101010101010101 * uint64(symbol)
	xorResult := word ^ mask
	found := (xorResult - 0x0101010101010101) &^ xorResult & 0x8080808080808080
	if found == 0 {
		return 0, false
	}

	result := ((((found - 1) & 0x101010101010101) * 0x101010101010101) >> 56) - 1

	return int(result) + offset, true
}

Suppose we want to find ; in the string smth;9.9. Let’s analyze the algorithm in steps.

Step 1. Form and apply a bitmask

We’ll first convert the string data into a uint64 value and create a mask using the ASCII value of the semicolon character:

data := []byte("smth;9.9") // = [73 6d 74 68 3b 39 2e 39]

word := *(*uint64)(unsafe.Pointer(&data[0])) //= 0x392e393b68746d73, bytes in the reverse order because of little-endian

var mask uint64 = 0x101010101010101 * uint64(symbol) // = 0x101010101010101 * 0x3b = 0x3b3b3b3b3b3b3b3b

xorResult := word ^ mask // = 0x2150200534f5648

The XOR operation for two bits returns zero if both are equal. The ASCII code for the semicolon is 0x3b, therefore if we apply the mask with 0x3b3b3b3b3b3b3b3b then we receive an empty byte in the place of the semicolon. Otherwise, it will be a non-zero value.

Step 2. Checking for an empty byte

The approach is described in Bit Twiddling Hacks and suggested by Alan Mycroft in 1987. Let’s split the next line into suboperations:

tmp1 := (xorResult - 0x0101010101010101) // = 0x11400ff524e5547

With the expression above, we set the most significant bit (MSb) to 1 for any byte that is either zero (0x00) or has a value greater than 0x81.

Also, we invert each bit in the xorResult:

tmp2 := ^xorResult // = 0xfdeafdffacb0a9b7

For the bytes from the set [0x00, 0x01, 0x02, ..., 0x7f] the result will be [0xff, 0xfe, 0xfd, ..., 0x80].

There is only one case when the MSb will be set in tmp1 and tmp2: if the initial byte was 0x00. Therefore, after the operation below, only the byte that originally held the semicolon character will have its MSb bit set:

tmp3 := tmp1 & tmp2 // = 0x10000ff00000107

Now, we want to unset all bits except the MSb. We apply the mask 0x8080808080808080 with the MSb bit set:

found := tmp3 & 0x8080808080808080  // = 0x8000000000

This way, if the most significant bit of any byte is set in the found variable, the string contains the desired character.

Step 3. Finding the position of the character

Now we need to find the index of the byte with the MSb set (let’s call this byte “target”). Again, we use the modified trick from Bit Twiddling Hacks. First of all, we do subtraction:

// found = 0x8000000000 = 00000000 00000000 00000000 10000000 00000000 00000000 00000000 00000000

tmp := found - 1 // = 0x7fffffffff == 00000000 00000000 00000000 01111111 11111111 11111111 11111111 11111111

All bits after the MSb were set. After that, we apply the mask 0x101010101010101 to unset all bits except the rightest (or least significant, LSb):

tmp2 := tmp & 0x101010101010101 // = 0x101010101

If we count the amount of not empty bits, we’ll receive the position of the target byte. For that, we can use the next trick:

tmp3 := tmp2 * 0x101010101010101
//   0x101010101       = 00000000 00000000 00000000 00000001 00000001 00000001 00000001 00000001
// * 0x101010101010101 = 00000001 00000001 00000001 00000001 00000001 00000001 00000001 00000001
// = 0x505050504030201 = 00000101 00000101 00000101 00000101 00000100 00000011 00000010 00000001

Multiplication by 0x101010101010101 equals this operation: tmp3 := tmp2 + (tmp2 << 8) + (tmp2 << 16) + ... + (x << 56) and has one useful property: it calculates the prefix sum for each byte. You can see the pattern even in the hexadecimal value 0x505050504030201: 5 5 5 5 4 3 2 1. Now, we can just isolate the top byte and subtract one to receive the proper offset of the target byte:

result := (tmp4 >> 56) - 1 // = 4

This way we found the position of the semicolon. You can easily modify the mask in step 1 and find the position of any other symbol.

Conclusion

Bitwise operations can significantly boost your code’s performance but at the cost of reduced readability. It’s essential to weigh these factors when selecting an algorithm. Additionally, modern compilers can optimize certain operations independently, so profiling your code before and after implementing bitwise operations is crucial to verify actual performance gains.

In scenarios where data processing speed is critical and a slight decrease in readability is acceptable, bitwise operations can be justified. However, for most tasks, standard functions from stirng or bytes packages are enough.

More information about bitwise tricks you can find here: