Python itertools.groupby gives unexpected results when sorting by binary representation

I am trying to solve a simple problem using itertools.groupby: group the numbers from 0 to 7 according to the number of 1's in their binary representation. So I want to produce the mapping:

{0: [0], 1: [1, 2, 4], 2: [3, 5, 6], 3: [7]}

But here is what I get from groupby:

>>> from itertools import groupby
>>> def key(i):
...     print(i, bin(i), bin(i).count('1'))
...     return bin(i).count('1')
>>> groups = {k: list(v) for k, v in groupby(range(8), key=key)}
0 0b0 0
1 0b1 1
2 0b10 1
3 0b11 2
4 0b100 1
5 0b101 2
6 0b110 2
7 0b111 3
>>> groups
{0: [0], 1: [4], 2: [5, 6], 3: [7]}

The result has me absolutely baffled. The print statements show that the individual calls to the key function behave as expected, and yet I lose the numbers 1, 2, 3 along the way.
It gets even worse when I use e.g. 16:

>>> {k: list(v) for k, v in groupby(range(16), key=lambda i: bin(i).count('1'))}
{0: [0], 1: [8], 2: [12], 3: [13, 14], 4: [15]}

I am hoping to understand how groupby arrives at this result, and to learn if there is a way to solve this using itertools. (I am not looking for a solution to the problem as such, only for a fancy generator solution using e.g. itertools.)

(I've tried this in Python 3.9 and 3.10, so I'm fairly certain it is not a bug.)

>Solution :

If you want to use groupby, you need to sort the input by the same key first:

groups = {k: list(v) for k, v in groupby(sorted(range(8), key=key), key=key)}

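groupby only groups consecutive items that share a key, so the same key can show up in several separate runs. In your dict comprehension, each later run with a given key overwrites the earlier one, which is why 1, 2 and 3 disappear.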
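To make the overwriting visible, here is a small sketch that collects the runs groupby yields into a list of pairs before any dict is built. The name bit_count is just a stand-in for the key function from the question, without the print call.

```python
from itertools import groupby

def bit_count(i):
    # Same key as in the question, minus the debugging print.
    return bin(i).count('1')

# Keep every (key, run) pair instead of letting a dict overwrite duplicates.
runs = [(k, list(v)) for k, v in groupby(range(8), key=bit_count)]
print(runs)
# [(0, [0]), (1, [1, 2]), (2, [3]), (1, [4]), (2, [5, 6]), (3, [7])]

# Building a dict from these pairs keeps only the last run per key.
print(dict(runs))
# {0: [0], 1: [4], 2: [5, 6], 3: [7]}
```

The keys 1 and 2 each appear twice in the list, and the dict keeps only the last run for each of them.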

You are already building a dict, so you don't need groupby at all:

from collections import defaultdict

d = defaultdict(list)
for i in range(8):
    d[key(i)].append(i)  # append each number to the list for its bit count
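If you would still like groupby in the picture without sorting, one possible variation (not necessarily better than the plain loop) is to merge the runs sharing a key as they arrive. This is only a sketch; bit_count and merged are names I made up for illustration.

```python
from collections import defaultdict
from itertools import groupby

def bit_count(i):
    # Number of 1 bits in the binary representation of i.
    return bin(i).count('1')

merged = defaultdict(list)
for k, run in groupby(range(8), key=bit_count):
    # extend rather than assign, so earlier runs with the same key are kept
    merged[k].extend(run)

print(dict(merged))
# {0: [0], 1: [1, 2, 4], 2: [3, 5, 6], 3: [7]}
```

Since the runs arrive in input order, the numbers inside each group stay in ascending order.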