4 minutes
Optimizing character search with bitwise operations
In the previous post about the 1 Billion Row Challenge, I used bitwise operations to find a separator in the string. It’s time to describe it in more detail. This is a source code:
func findPosition(word uint64, symbol byte, offset int) (int, bool) {
var mask uint64 = 0x101010101010101 * uint64(symbol)
xorResult := word ^ mask
found := (xorResult - 0x0101010101010101) &^ xorResult & 0x8080808080808080
if found == 0 {
return 0, false
}
result := ((((found - 1) & 0x101010101010101) * 0x101010101010101) >> 56) - 1
return int(result) + offset, true
}
Suppose we want to find ;
in the string smth;9.9
. Let’s analyze the algorithm in steps.
Step 1. Form and apply a bitmask
We’ll first convert the string data into a uint64
value and create a mask using the ASCII value of the semicolon character:
data := []byte("smth;9.9") // = [73 6d 74 68 3b 39 2e 39]
word := *(*uint64)(unsafe.Pointer(&data[0])) //= 0x392e393b68746d73, bytes in the reverse order because of little-endian
var mask uint64 = 0x101010101010101 * uint64(symbol) // = 0x101010101010101 * 0x3b = 0x3b3b3b3b3b3b3b3b
xorResult := word ^ mask // = 0x2150200534f5648
The XOR
operation for two bits returns zero if both are equal. The ASCII code for the semicolon is 0x3b
, therefore if we apply the mask with 0x3b3b3b3b3b3b3b3b
then we receive an empty byte in the place of the semicolon. Otherwise, it will be a non-zero value.
Step 2. Checking for an empty byte
The approach is described in Bit Twiddling Hacks and suggested by Alan Mycroft in 1987. Let’s split the next line into suboperations:
tmp1 := (xorResult - 0x0101010101010101) // = 0x11400ff524e5547
With the expression above, we set the most significant bit (MSb) to 1 for any byte that is either zero (0x00) or has a value greater than 0x81.
Also, we invert each bit in the xorResult
:
tmp2 := ^xorResult // = 0xfdeafdffacb0a9b7
For the bytes from the set [0x00, 0x01, 0x02, ..., 0x7f]
the result will be [0xff, 0xfe, 0xfd, ..., 0x80]
.
There is only one case when the MSb will be set in tmp1
and tmp2
: if the initial byte was 0x00. Therefore, after the operation below,
only the byte that originally held the semicolon character will have its MSb bit set:
tmp3 := tmp1 & tmp2 // = 0x10000ff00000107
Now, we want to unset all bits except the MSb. We apply the mask 0x8080808080808080
with the MSb bit set:
found := tmp3 & 0x8080808080808080 // = 0x8000000000
This way, if the most significant bit of any byte is set in the found
variable, the string contains the desired character.
Step 3. Finding the position of the character
Now we need to find the index of the byte with the MSb set (let’s call this byte “target”). Again, we use the modified trick from Bit Twiddling Hacks. First of all, we do subtraction:
// found = 0x8000000000 = 00000000 00000000 00000000 10000000 00000000 00000000 00000000 00000000
tmp := found - 1 // = 0x7fffffffff == 00000000 00000000 00000000 01111111 11111111 11111111 11111111 11111111
All bits after the MSb were set. After that, we apply the mask 0x101010101010101
to unset all bits except the rightest (or least significant, LSb):
tmp2 := tmp & 0x101010101010101 // = 0x101010101
If we count the amount of not empty bits, we’ll receive the position of the target byte. For that, we can use the next trick:
tmp3 := tmp2 * 0x101010101010101
// 0x101010101 = 00000000 00000000 00000000 00000001 00000001 00000001 00000001 00000001
// * 0x101010101010101 = 00000001 00000001 00000001 00000001 00000001 00000001 00000001 00000001
// = 0x505050504030201 = 00000101 00000101 00000101 00000101 00000100 00000011 00000010 00000001
Multiplication by 0x101010101010101
equals this operation: tmp3 := tmp2 + (tmp2 << 8) + (tmp2 << 16) + ... + (x << 56)
and
has one useful property: it calculates the prefix sum for each byte. You can see the pattern even in the hexadecimal value 0x505050504030201
: 5 5 5 5 4 3 2 1
. Now, we can just isolate the top byte and subtract one to receive the proper offset of the target byte:
result := (tmp4 >> 56) - 1 // = 4
This way we found the position of the semicolon. You can easily modify the mask in step 1 and find the position of any other symbol.
Conclusion
Bitwise operations can significantly boost your code’s performance but at the cost of reduced readability. It’s essential to weigh these factors when selecting an algorithm. Additionally, modern compilers can optimize certain operations independently, so profiling your code before and after implementing bitwise operations is crucial to verify actual performance gains.
In scenarios where data processing speed is critical and a slight decrease in readability is acceptable, bitwise operations can be justified. However, for most tasks, standard functions from stirng
or bytes
packages are enough.
More information about bitwise tricks you can find here: