Jun 18th, 2021 - written by Kimserey.

When working with cryptographic algorithms and hashes, it’s quite common to operate at the bit and byte level. For those situations, Python provides functionality to convert `int` to `bytes` and vice versa, and bitwise operators to operate on bits. In today’s post we will look at the different bitwise operators available, with examples.

In Python 3.x, the bitwise operators work on `int` values and are not defined for `bytes`. But in most cases, methods take in and return the `bytes` type, so in order to get to the bits, we can either select a single byte in the `bytes` or directly convert the whole `bytes` to an `int`.

>>> value = b"12"

Here we create a `bytes` literal. A `bytes` literal may only contain ASCII characters, and for those characters the stored bytes are the same as their `utf-8` encoding. If you have doubts about how `utf-8` works, you can refer to my previous post on Unicode explained.

The little `b` in front of the string value indicates that the data is `bytes`, and the string value that we see is the `utf-8` encoded value of `"12"`.
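
As a quick check of the point above, here is a minimal sketch (the names `text` and `encoded` are only illustrative) showing that encoding the string `"12"` as `utf-8` gives the same bytes as the literal:

```python
# Encoding the str "12" as UTF-8 yields the same bytes as the literal b"12".
text = "12"                      # illustrative name, not from the post
encoded = text.encode("utf-8")   # illustrative name, not from the post

print(encoded)                   # b'12'
print(encoded == b"12")          # True
print(list(encoded))             # [49, 50], one integer per byte
print(encoded.decode("utf-8"))   # 12
```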

With the bytes, we can convert to `int` by either indexing a single byte of the `bytes` or converting the whole value with `int.from_bytes`:

>>> value[0]
49
>>> int.from_bytes(value, byteorder="big")
12594

Taking `value[0]` returns the decimal value of the character `1` in `utf-8` encoding. It is `49`, as the Unicode code point for `1` is `0x31`, which is `0011 0001` in binary, hence `2^0 + 2^4 + 2^5 = 49`.
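
The arithmetic above can be checked with a short sketch. The names `code_point` and `set_bits` are only illustrative, and note that in Python exponentiation is written `**`, since `^` is the XOR operator:

```python
# The character "1" has code point 0x31, which is 49 in decimal.
code_point = ord("1")

print(code_point)                 # 49
print(hex(code_point))            # 0x31
print(format(code_point, "08b"))  # 00110001

# Collect the positions of the bits that are set, then sum the powers of two.
set_bits = [i for i in range(8) if (code_point >> i) & 1]
print(set_bits)                       # [0, 4, 5]
print(sum(2 ** i for i in set_bits))  # 49
```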

We specify the `byteorder` as big-endian for the conversion to an integer.

The endianness defines the order of the bytes; `big` for big-endian means the leftmost byte, the one at the lowest index in the byte array, is the most significant byte of the word. The endianness is important for the functions converting from `int` to `bytes` and vice versa, as the bytes of the resulting value would otherwise be reversed.

For example, if we were to convert `12594` back with `little` after previously having converted with `big`, we’ll end up with `b"21"`:

>>> (12594).to_bytes(2, byteorder="little")
b'21'
>>> (12594).to_bytes(2, byteorder="big")
b'12'
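
To make the byte order more concrete, here is a rough sketch that rebuilds both integers by hand for a two-byte value. It uses `<<` and `|`, which are covered later in this post, and the names `big` and `little` are only illustrative:

```python
value = b"12"  # bytes 0x31 and 0x32

# Big-endian: the first byte is the most significant one.
big = (value[0] << 8) | value[1]
# Little-endian: the first byte is the least significant one.
little = value[0] | (value[1] << 8)

print(big, big == int.from_bytes(value, byteorder="big"))           # 12594 True
print(little, little == int.from_bytes(value, byteorder="little"))  # 12849 True
```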

With the decimal value, we can then represent it in multiple formats: binary, octal, and hexadecimal:

>>> bin(int.from_bytes(value, byteorder="big"))
'0b11000100110010'
>>> oct(int.from_bytes(value, byteorder="big"))
'0o30462'
>>> hex(int.from_bytes(value, byteorder="big"))
'0x3132'

`0b` identifies values displayed as binary (base 2 - bit 0 or 1), `0o` identifies octal (base 8 - values from 0 to 7), `0x` identifies hexadecimal (base 16 - values from 0 to F).
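
The same prefixes also work the other way around, as integer literals and as input to `int()`. A small sketch, where `n` is just an illustrative name:

```python
# The same integer written with each literal prefix.
print(0b11000100110010 == 0o30462 == 0x3132 == 12594)  # True

# int() parses a prefixed string when given the matching base.
print(int("0b11000100110010", 2))  # 12594
print(int("0x3132", 16))           # 12594

# format() gives the digits without the prefix, optionally zero padded.
n = 12594
print(format(n, "016b"))  # 0011000100110010
print(format(n, "x"))     # 3132
```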

Now that we know how to convert from bytes to binary and display the binary format, we can start looking at bitwise operators.

The bitwise operators available in Python are:

Operator | Definition |
---|---|
`<<` | Bitwise left shift |
`>>` | Bitwise right shift |
`&` | Bitwise AND |
`\|` | Bitwise OR |
`^` | Bitwise XOR |
`~` | Complement |

The left `<<` and right `>>` bit shifts are useful to move bits to the left or to the right. For example:

>>> 0b0101 << 1
10
>>> bin(0b0101 << 1)
'0b1010'

We shifted the bits by one position. Because integer literals can be written in binary or hexadecimal notation as well as decimal, we can specify either operand in any of these notations:

>>> 0b0101 << 0x0f
163840
>>> bin(0b0101 << 0x0f)
'0b101000000000000000'
>>> 0x0f
15

We shifted the bits to the left by 15 positions, as `0x0f` is 15.
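
The right shift works the same way in the opposite direction. As a short sketch (with `n` as an illustrative name), a shift by `k` positions can be read as a multiplication or a floor division by `2**k`:

```python
n = 0b0101  # 5

# A left shift by k multiplies by 2**k.
print(n << 1, n * 2 ** 1)    # 10 10
print(n << 15, n * 2 ** 15)  # 163840 163840

# A right shift by k is a floor division by 2**k; shifted-out bits are dropped.
print(bin(0b1010 >> 1))  # 0b101
print(0b1011 >> 1)       # 5
print(-5 >> 1)           # -3
```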

The `&` operator is a bitwise AND, as follows:

>>> 0b1001 & 0b1000
8
>>> bin(0b1001 & 0b1000)
'0b1000'

This can be useful to select parts of some bytes while masking the rest, for example:

>>> bin(0b10011111 & 0xf0)
'0b10010000'

With `& 0xf0`, we do an AND with `1111 0000`, essentially keeping the upper 4 bits and masking out the lower 4 bits.
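
Building on the mask above, a minimal sketch of splitting a byte into its two 4-bit halves and testing a single bit. The names `byte`, `high_nibble` and `low_nibble` are only illustrative:

```python
byte = 0b10011111

high_nibble = (byte & 0xF0) >> 4  # keep the upper 4 bits, then move them down
low_nibble = byte & 0x0F          # keep the lower 4 bits

print(format(high_nibble, "04b"))  # 1001
print(format(low_nibble, "04b"))   # 1111

# Test whether a single bit is set (bit 7 here, counting from 0).
print(bool(byte & (1 << 7)))  # True
```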

Similarly, the `|` operator is a bitwise OR:

>>> 0b1001 | 0b1000
9
>>> bin(0b1001 | 0b1000)
'0b1001'

OR is useful in situations where we want to combine values; for example, we can concatenate two 4-bit words by combining a left shift with `|`:

>>> bin(0b1010 << 4 | 0b1111)
'0b10101111'

We concatenated our first word `1010` with `1111` by left shifting it by 4 positions and executing an OR.
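
The same shift-then-OR pattern generalises to longer inputs. Here is a rough sketch with two hypothetical helpers, `pack_nibbles` and `pack_bytes`, which are not part of Python and are named only for illustration:

```python
def pack_nibbles(high, low):
    """Combine two 4-bit values into one byte (hypothetical helper)."""
    return ((high & 0x0F) << 4) | (low & 0x0F)


def pack_bytes(data):
    """Fold bytes into one integer, most significant byte first (hypothetical helper)."""
    result = 0
    for b in data:
        # Make room for the next byte, then OR it into the low 8 bits.
        result = (result << 8) | b
    return result


print(bin(pack_nibbles(0b1010, 0b1111)))  # 0b10101111
print(pack_bytes(b"12"))                  # 12594
```

With `b"12"` as input, `pack_bytes` gives the same result as `int.from_bytes(b"12", byteorder="big")`.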

The `^` operator executes an exclusive OR (XOR):

>>> bin(0b1001 ^ 0b1000)
'0b1'
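
One property of XOR that comes up often with cryptographic code is that applying the same value twice restores the original. A minimal sketch, where `message`, `key`, `masked` and `restored` are illustrative names and the key is an arbitrary value, not a secure scheme:

```python
print(format(0b1001 ^ 0b1000, "04b"))  # 0001

# XOR each byte with a key byte, then XOR again to get the original back.
message = b"12"
key = b"\x5a\xa5"  # arbitrary illustrative key, not a secure scheme

masked = bytes(m ^ k for m, k in zip(message, key))
restored = bytes(c ^ k for c, k in zip(masked, key))

print(masked.hex())  # 6b97
print(restored)      # b'12'
```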

And lastly, `~` returns the complement:

>>> bin(~0b00000010)
'-0b11'
>>> ~0b10
-3

The complement of `0b00000010` being `-0b11` might not have been what we expected. Python integers are not limited to a fixed number of bits, so `~x` is defined as `-x - 1`, which is the result a two’s complement representation gives; hence the `-` sign in front of `0b`.

For NOT on a single byte, we would have expected `0b11111101`, and we can actually get that value by masking the result to 8 bits with `~0b00000010 & 0xff`:

>>> ~0b00000010 & 0xff
253
>>> bin(~0b00000010 & 0xff)
'0b11111101'
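
The mask can be generalised to any width. A small sketch, assuming a hypothetical helper named `bitwise_not` that is not part of Python:

```python
def bitwise_not(value, width=8):
    """Flip the lowest `width` bits of a non-negative integer (hypothetical helper)."""
    mask = (1 << width) - 1  # e.g. 0xff for width=8
    return ~value & mask


print(~2 == -2 - 1)                               # True, since ~x equals -x - 1
print(format(bitwise_not(0b00000010), "08b"))     # 11111101
print(format(bitwise_not(0b10, width=4), "04b"))  # 1101
```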

And that concludes today’s post!

Today we looked at how we could manipulate bits and bytes in Python. We started by looking at how to convert bytes into integers and how to display their binary, octal and hexadecimal representations. We then moved on to each bitwise operator provided in Python, with examples of where it could be used. I hope you liked this post and I’ll see you on the next one!