Bit Manipulation
Contents
- Why to bit?
- Some definitions
- Bit number representation
- Bitwise operations
- Bitset operations
- Equations!
- Bits AND C++
- Problems to practice
Why to bit?
Bit manipulation is a very common topic to know, not only because bit manipulation problems show up in contests from time to time, but also because it is involved in advanced topics such as Bitmask DP, BIT, Trie...
But in general, knowing how to bit helps to unlock ideas related to logarithmic complexities. The very idea of being able to cover \(10^9\) things in \(\log_2(10^9) \approx 30\) steps is, by itself, powerful, and can be applied in other algorithms such as binary search, binary lifting in LCA and some other binary stuff...
Learning how bits work will upgrade your mind.
Some definitions
While working with bits, some common terms come up that might be confusing the first time you hear them, so it is better to know them beforehand.
- Bit manipulation: The act of algorithmically manipulating bits or binary digits; in general, any work at the bit level: counting bits turned on, setting bits, clearing bits, toggling bits, checking parity, checking if a number is a power of two...
- Bitmask: A number that is used to represent a set of bits, where each bit corresponds to a specific element in the set. This mask made out of bits is a binary pattern used to manipulate or retrieve information about certain data.
- Bitwise operations: Operations that directly manipulate bits, such as AND, OR, XOR, NOT, and bit shifts.
Now let's put some definitions on the table. A bit is the minimal unit of information. We talk about a binary digit because it can only take two values: 0 or 1. A bit is the representation of a state: either on or off, true or false, yes or no.
With one bit alone we can do very little, but if we put a bunch of them together, we can do a lot of fun things, and we can represent a lot of information if we know how to use them the right way.
Bit number representation
How does our computer store numbers?
How can the number 9223372036854775807 possibly fit in a 64-bit integer representation? How does that number fit into 64 positions?
The answer is that our computer doesn't store numbers in decimal representation, but in binary representation. In binary representation, each digit can only be a 0 or a 1, and each position represents a power of 2.
So, we can express any number as a sum of powers of 2.
Let's try to count as high as we can with our hands.
What we usually do is raise one more finger each time we want to count one more quantity. For 1, I raise 1 finger; for 2, I raise 2 fingers, and so on.
\[ (fingers) = 00001 = 1 \]
\[ 00011 = 2 \]
\[ 00111 = 3 \]
But what if we start counting from nothing? Zero fingers raised, zero quantity.
\[00000 = 0\]
If I raise one finger, I have the number one.
\[ 00001 = 1 \]
It seems the same so far, but when it is time for the second finger, what if I raise it, then put down the first one, and call that number two?
\[ 00010 = 2 \]
Now I can raise the first finger again and call it number three. We got one more number out of two fingers, out of nothing!
\[ 00011 = 3 \]
If I keep going the same way, I can now raise the next finger, put down the others, and call that a new number.
\[ 00100 = 4 \]
And we can keep going all the way with the rest of our fingers:
\[ 00101 = 5 \]
\[ 00110 = 6 \]
\[ 00111 = 7 \]
\[ 01000 = 8 \]
\[ 01001 = 9 \]
\[ 01010 = 10 \]
\[ \cdots \]
If you think about it, our fingers can also be in two states: either raised or down. Two states, just like a bit. By this definition, we could call our hand a set of bits, a set of five bits.
We found a better method for counting. Every time we raise a new finger, we double the amount of numbers we are able to count, because we can redo all the combinations done before, now together with the new finger.
For instance, we counted with the first two fingers up to number 3 while all the other fingers were down. If we raise one more finger, we can still do every combination, but now also counting the new finger.
Basically, that's how counting numbers with bits works. You can add more fingers to count to greater numbers, and that's exactly how our computer works.
If you noticed, every time we raised a new finger we got very singular numbers: 1, 2, 4, 8... (check the representations of those numbers above).
Counting this way leads us to assign a certain weight to every finger: its individual value, the number we create when only that finger is raised, as we've done above.
If we rewrite it in a fancier way we have:
\[ 2^4 = 16, 2^3 = 8, 2^2 = 4, 2^1 = 2, 2^0 = 1 \]
Let's sum all those numbers and voilà! Now we know that with our five fingers, we can count up to 31.
But let's go farther: what if we had more than five fingers, what if we had \(i\) fingers? Well, we can easily notice that the weight of each new finger is one more than the sum of all the weights at its right. With this, we can also notice that we can form every number from 0 up to \(2^i\), excluding \(2^i\) itself. The way we were counting, if we have all the fingers raised and we put them down and raise the next one, that is the immediate next number.
Counting with bits is nothing but summing powers of two.
If you think about it, we're counting in a binary system, a numeral system whose base is 2, instead of 10 as we usually use.
This leads us to think about a value based on a position.
If we think about our decimal system, we have ten symbols denoting numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
But when we need to count to numbers greater than 9, we use another position: 10.
The number 10 is not a symbol itself; it's better to say that it is a combination of the previous symbols, placed in a specific way.
So, when we need to count up to the number of our base (number 10 in this case), we cannot represent it as a symbol itself; we use another position, forming the number one-zero = 1 0.
This leads us to think about it as a sum of powers of ten. It's well known that 100 (one-zero-zero) is the same as saying \(10^2\), 10 (one-zero) is the same as \(10^1\), and 1 (one) is the same as \(10^0\).
If you think about it, we have invented nothing with our clever way of counting with our fingers; the same properties of the sum of powers remain, with the difference that in base two we only have two symbols, 0 and 1. But every time we need to count one more number, we add one position, just like when we're counting as usual in base 10.
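For example, the same positional reading works in both bases:
\[ 357 = 3 \cdot 10^2 + 5 \cdot 10^1 + 7 \cdot 10^0 \]
\[ 101_2 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 = 5 \]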
But with our new definition of positional value we can develop some other concepts.
First, let's call our bunch of bits (initially, the fingers of our hand) a binary string. This makes sense because it's a string formed only by 0's and 1's (binary states, binary values). If you want, this is a bunch of bits, a set of bits; we could also call it a bitset.
Now, if we have positional value, we also have a place with the least and a place with the greatest value. The right-most place is the least significant one, because it takes the value \(2^0 = 1\); and the left-most place is the most significant one, because in an \(n\)-bit string it takes the value \(2^{n-1}\).
But these places can hold a 0 if our number does not have a 1 on them. We can have any combination of bits, and every single one of them is a different number, so let's call least significant bit or LSB the turned-on bit closest to the right-most place of our number, and most significant bit or MSB the turned-on bit closest to the left-most place.
Let's put all these things together in a formal way.
A binary string \(S\) of size \(n\) is a string where each of its characters \(s_i \in \{0, 1\} \; \forall \; i \in [0, n-1]\), indexed from the LSB. The decimal value carried by each bit is \( s_i \cdot 2^i \), so the decimal value of the whole binary representation is \( \sum_{i=0}^{n-1} s_i \cdot 2^i \).
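As a quick sanity check, here is a minimal C++ sketch of that sum (the function name is my own):

```cpp
#include <cstdint>
#include <iostream>
#include <string>

// Decimal value of a binary string: sum of s_i * 2^i, where s_0 is
// the right-most (least significant) character.
uint64_t binaryToDecimal(const std::string &S) {
    uint64_t value = 0;
    int n = S.size();
    for (int i = 0; i < n; i++)
        if (S[n - 1 - i] == '1')  // s_i, counted from the right
            value += 1ULL << i;   // add 2^i
    return value;
}

int main() {
    std::cout << binaryToDecimal("11111") << "\n";  // 31, our five fingers
    return 0;
}
```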
Bitwise operations
So we have a brand new type of data: bits. We can do a lot of things with them, so let's start with the operations. There are mainly six, explained as follows; the symbol next to the name of each operation corresponds to its C++ operator.
- AND (\(\&\))
Binary operator: takes two inputs (\(A\), \(B\)) and produces a single output.
Returns 1 if both \(A\) and \(B\) are 1, otherwise returns 0.
Other ways to represent the AND bitwise operation in literature and references are: \( \cdot, \land, \&, \cap, \text{AND}, \text{conjunction} \).

| A | B | A & B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

- OR ( \(|\) )
Binary operator: takes two inputs (\(A\), \(B\)) and produces a single output.
Returns 1 if either \(A\) or \(B\), or both, are 1; otherwise returns 0. Other ways to represent the OR bitwise operation in literature and references are: \( +, \lor, \cup, \text{OR}, \text{disjunction} \).

| A | B | A \| B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

- XOR (^)
Exclusive or.
Binary operator: takes two inputs (\(A\), \(B\)) and produces a single output.
Returns 1 if \(A\) and \(B\) are different, otherwise returns 0. This one, or that one, but not both.
Other ways to represent the XOR bitwise operation in literature and references are: \( \oplus, \text{XOR}, \text{exclusive or} \).

| A | B | A ^ B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

- NOT (~)
Unary operator: takes one input (\(A\)) and produces a single output.
Returns 1 if \(A\) is 0, otherwise returns 0.
Returns the negation of \(A\).
Other ways to represent the NOT bitwise operation in literature and references are: \( \sim, \neg, \bar{A}, \text{NOT}, \text{complement} \).

| A | ~A |
|---|---|
| 0 | 1 |
| 1 | 0 |

- Left shift (<<)
Binary operator: takes two inputs (\(A\), \(n\)) and produces a single output.
Shifts the bits of \(A\) to the left by \(n\) positions.
Fills the vacated bits on the right with 0.
Looking at the decimal value of the bitset, it multiplies \(A\) by two \(n\) times; this is, it computes \(A \cdot 2^n\).

| \(A_2\) | \(A_{10}\) | \(n\) | \((A << n)_2\) | \((A << n)_{10}\) |
|---|---|---|---|---|
| 1 | 1 | 1 | 10 | 2 |
| 11 | 3 | 1 | 110 | 6 |
| 101 | 5 | 1 | 1010 | 10 |
| 1 | 1 | 3 | 1000 | 8 |
| 1011101 | 93 | 2 | 101110100 | 372 |

- Right shift (>>)
Binary operator: takes two inputs (\(A\), \(n\)) and produces a single output.
Shifts the bits of \(A\) to the right by \(n\) positions.
Fills the vacated bits on the left with 0.
Looking at the decimal value of the bitset, it divides \(A\) by two \(n\) times; this is, it computes \( \lfloor \frac{A}{2^n} \rfloor \).

| \(A_2\) | \(A_{10}\) | \(n\) | \((A >> n)_2\) | \((A >> n)_{10}\) |
|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 0 |
| 11 | 3 | 1 | 1 | 1 |
| 101 | 5 | 1 | 10 | 2 |
| 1000 | 8 | 3 | 1 | 1 |
| 1000001111 | 527 | 3 | 1000001 | 65 |
These operations are called bitwise operations.
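A minimal C++ sketch to try all six operators (the sample values are arbitrary):

```cpp
#include <iostream>

int main() {
    unsigned int a = 0b1011101;  // 93
    unsigned int b = 0b0000110;  // 6

    std::cout << (a & b) << "\n";   // AND: 0b0000100 = 4
    std::cout << (a | b) << "\n";   // OR:  0b1011111 = 95
    std::cout << (a ^ b) << "\n";   // XOR: 0b1011011 = 91
    std::cout << (~a) << "\n";      // NOT flips all 32 bits of the unsigned int
    std::cout << (a << 2) << "\n";  // left shift:  93 * 2^2 = 372
    std::cout << (a >> 3) << "\n";  // right shift: floor(93 / 2^3) = 11
    return 0;
}
```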
The last thing we need to know is that, as operators, these have an identity element: an element such that, if you operate it with a number \(A\), the result is \(A\) itself.
The identity for the algebraic sum is zero, because \(A + 0 = A\); the identity for algebraic multiplication is one, because \(A \cdot 1 = A\).
It is easy to see that our bitwise operations also have identities, and, as we defined before, such an element must be of the form \(I \in \mathbb{B}\). For this to be clear, let's try the concept on whole bitsets, and not only single bits.
If we had an arbitrary number \(A = 10100110011010\), what element would be needed for it to stay the same after an AND operation?
In the first case, if we choose zero, we would have:
\[ 10100110011010 \; \& \; 00000000000000 = 00000000000000 \]
That is, operating bit by bit following the rules we've seen before, the AND of a number with zero will always be zero, no matter what.
On the other side, with a 1-filled bitset:
\[ 10100110011010 \; \& \; 11111111111111 = 10100110011010 \]
Because an AND gives 1 only when \(a_i\) is also 1, here's our identity for the AND operation: the 1-filled bitset.
Now let's try the same for the OR operation:
\[ 10100110011010 \; | \; 11111111111111 = 11111111111111 \]
That is, if we have at least one 1 between both operands, the result will be 1 too. This means that if we OR any number with a 1-filled number, the result will be the 1-filled number, no matter what.
\[ 10100110011010 \; | \; 00000000000000 = 10100110011010 \]
And here we found our identity element: it is easy to see that, for the OR operation, our identity is 0.
Now let's try the same for the XOR operation:
\[ 10100110011010 \oplus 00000000000000 = 10100110011010 \]
\[ 10100110011010 \oplus 11111111111111 = 01011001100101 \]
XOR is interesting for its properties. We can see that, exactly as with OR, operating \(A\) with 0 results in the same \(A\) itself. But we can also see that when we XOR with a 1-filled number, the result is \(A\) with its bits flipped, because, in addition to the above, whenever we have two 1's the result is zero.
In conclusion, \(A \oplus 0 = A\), so our identity element is 0. But also \(A \oplus (\text{1-filled number}) = \bar{A} \).
Since NOT is a unary operator, we cannot establish this same idea; instead, we can say that the negation of our negation results in the original: \( \bar{\bar{A}} = A\), in other notation: \(\sim(\sim A) = A\). As can be seen here:
\[ \sim 10100110011010 = 01011001100101 \]
\[ \sim 01011001100101 = 10100110011010 \]
As we can see, it flips the bits every time.
Finally, bit shifting by definition shifts the bits of \(A\) to the right/left by \(n\) positions. So, if you shift \(A\) by 0 positions, you'll get the same number, and we could say 0 is its identity element.
In conclusion:
| Operation | Symbol | Identity |
|---|---|---|
| AND | & | 1 |
| OR | \| | 0 |
| XOR | ^ | 0 |
| NOT | ~ | \(\not\exists\) |
| Left shift | << | 0 |
| Right shift | >> | 0 |
The last thing that must be cleared up is that we've been using the term 1-filled number to denote... a number filled with 1's, but how can we build it using bit representations?
Let \(|A|\) be the size (number of bits) of \(A\); then we can build the 1-filled number for \(A\) as \( (1 << |A|) - 1 \).
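A quick C++ sketch of that trick (the helper name is my own):

```cpp
#include <cstdint>
#include <iostream>

// Number with the lowest `size` bits turned on: (1 << size) - 1.
uint64_t onesFilled(int size) {
    return (1ULL << size) - 1;
}

int main() {
    std::cout << onesFilled(4) << "\n";   // 0b1111 = 15
    std::cout << onesFilled(14) << "\n";  // 16383, the 1-filled mask for |A| = 14
    return 0;
}
```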
Bitset operations
Playing with bitwise operations is fun and important, but only doing the things we know so far won't let us solve problems that involve bit manipulation. For that, there's a set of certain operations we're interested in.
Most of them involve both a bitmask and a bitwise operation. You can peek information from a number by proposing a certain bitmask and operating both with a bitwise operator.
The bitwise part is basically all we've been doing in the Bitwise operations section; combining it in a clever way lets us do the following operations (see the sketch after this list):
- Get bit: (n & (1 << i)) != 0
- Set bit: n |= (1 << i)
- Clear bit: n &= ~(1 << i)
- Toggle bit: n ^= (1 << i)
- Get LSB: n & -n
- Clear LSB: n &= (n - 1)
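A minimal C++ sketch wrapping these operations as helpers (the names are my own):

```cpp
#include <cstdint>
#include <iostream>

using u64 = uint64_t;

bool getBit(u64 n, int i)    { return (n & (1ULL << i)) != 0; }  // is bit i on?
u64  setBit(u64 n, int i)    { return n | (1ULL << i); }         // turn bit i on
u64  clearBit(u64 n, int i)  { return n & ~(1ULL << i); }        // turn bit i off
u64  toggleBit(u64 n, int i) { return n ^ (1ULL << i); }         // flip bit i
u64  getLSB(u64 n)           { return n & -n; }                  // lowest set bit
u64  clearLSB(u64 n)         { return n & (n - 1); }             // drop lowest set bit

int main() {
    u64 n = 0b10110;                       // 22
    std::cout << getBit(n, 1) << "\n";     // 1: bit 1 is on
    std::cout << setBit(n, 0) << "\n";     // 0b10111 = 23
    std::cout << clearBit(n, 2) << "\n";   // 0b10010 = 18
    std::cout << toggleBit(n, 4) << "\n";  // 0b00110 = 6
    std::cout << getLSB(n) << "\n";        // 0b00010 = 2
    std::cout << clearLSB(n) << "\n";      // 0b10100 = 20
    return 0;
}
```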
As we've seen, a number's bit representation lets us handle it as a bitset on its own. But there's a powerful idea in this: we can see the bits of a number, literally, as the elements of a set.
Given this representation, bitwise operations turn into set operations (see the sketch after this list):
- Subset check: (A & B) == B
- Set union: A | B
- Set intersection: A & B
- Set difference: A & ~B
- Toggle subset: A ^ B
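A short C++ sketch of sets as bitmasks, where bit \(i\) encodes whether element \(i\) is in the set:

```cpp
#include <cstdint>
#include <iostream>

int main() {
    uint32_t A = 0b11010;  // the set {1, 3, 4}
    uint32_t B = 0b10010;  // the set {1, 4}

    std::cout << ((A & B) == B) << "\n";  // subset check: is B a subset of A? -> 1
    std::cout << (A | B) << "\n";         // union {1, 3, 4}:        0b11010 = 26
    std::cout << (A & B) << "\n";         // intersection {1, 4}:    0b10010 = 18
    std::cout << (A & ~B) << "\n";        // difference A \ B = {3}: 0b01000 = 8
    std::cout << (A ^ B) << "\n";         // toggle subset B in A:   0b01000 = 8
    return 0;
}
```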
Equations!
There are some interesting equations that involve bitwise operations and that are very useful in competitive programming.
Some properties of bitwise operations (a brute-force check follows these lists):
- \( a \; | \; b = a \oplus b + a \; \& \; b \)
- \( a \oplus (a \; \& \; b) = (a \; | \; b) \oplus b \)
- \( b \oplus (a \; \& \; b) = (a \; | \; b) \oplus a \)
- \( (a \; \& \; b) \oplus (a \; | \; b) = a \oplus b \)
For addition and subtraction:
- \( a + b = (a \; | \; b) + (a \; \& \; b) \)
- \( a + b = a \oplus b + 2 (a \; \& \; b) \)
- \( a - b = (a \oplus (a \; \& \; b)) - ((a \; | \; b) \oplus a) \)
- \( a - b = ((a \; | \; b) \oplus b) - ((a \; | \; b) \oplus a) \)
- \( a - b = (a \oplus (a \; \& \; b)) - (b \oplus (a \; \& \; b)) \)
- \( a - b = ((a \; | \; b) \oplus b) - (b \oplus (a \; \& \; b)) \)
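These identities are easy to sanity-check by brute force; a small C++ sketch that asserts a few of them over all small pairs:

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

int main() {
    // Exhaustively verify some of the identities above for 0 <= a, b < 256.
    for (int64_t a = 0; a < 256; a++) {
        for (int64_t b = 0; b < 256; b++) {
            assert((a | b) == (a ^ b) + (a & b));
            assert(a + b == (a | b) + (a & b));
            assert(a + b == (a ^ b) + 2 * (a & b));
            assert(a - b == ((a | b) ^ b) - ((a | b) ^ a));
        }
    }
    std::cout << "All identities hold.\n";
    return 0;
}
```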
Bits AND C++
C++ has a lot of support for working with bits: beyond the bitwise operators, it has built-in functions and a library to work with bitsets.
The built-in functions (available in GCC and Clang) are shown below, with a usage example after the list:
- __builtin_popcount(x) - Counts the number of set bits in x
- __builtin_clz(x) - Counts the number of leading zeros in x (undefined for x = 0)
- __builtin_ctz(x) - Counts the number of trailing zeros in x (undefined for x = 0)
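A quick usage example; note that these operate on unsigned int, and the ll-suffixed variants (e.g. __builtin_popcountll) handle 64-bit values:

```cpp
#include <iostream>

int main() {
    unsigned int x = 0b10110100;                 // 180
    std::cout << __builtin_popcount(x) << "\n";  // 4 set bits
    std::cout << __builtin_clz(x) << "\n";       // 24 leading zeros in a 32-bit int
    std::cout << __builtin_ctz(x) << "\n";       // 2 trailing zeros
    return 0;
}
```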
It is also important to know the usage of the std::bitset class; a minimal sketch follows.
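A minimal sketch of std::bitset, reusing the number \(A\) from the identity examples above:

```cpp
#include <bitset>
#include <iostream>
#include <string>

int main() {
    std::bitset<14> A(std::string("10100110011010"));
    std::cout << A.count() << "\n";     // number of set bits: 7
    std::cout << A.test(1) << "\n";     // is bit 1 on? -> 1
    A.flip();                           // toggle every bit, like XOR with 1-filled
    std::cout << A << "\n";             // prints 01011001100101
    std::cout << A.to_ulong() << "\n";  // its decimal value: 5733
    return 0;
}
```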
Problems to practice!
- Codeforces 106073C - Collatz polynomial
- Codeforces 105230B - Card Game