Python groupby function not recognising identical keys that can be grouped on

I have a list of lists that I am trying to group by the first element of each inner list. The problem is that my code does not recognise that there are only 5 distinct values of the element I am grouping on.

a = [['2.25.151989603747108360484758994222924880510', 1],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.151989603747108360484758994222924880510', 2],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1]]

As a simple test, I ran the following (thinking there might be something odd about the variable I’m grouping on):

from itertools import groupby

for key, group in groupby(a, lambda x: str(x[0]).strip()):
    print(key)

I get this:

2.25.151989603747108360484758994222924880510
2.25.23907329898781253437777953862543062317
2.25.151989603747108360484758994222924880510
2.25.23907329898781253437777953862543062317
2.25.159215431212584126451402597802236328925
2.25.339018106044083012102817776589396922392
2.25.159215431212584126451402597802236328925
2.25.23907329898781253437777953862543062317
2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925
2.25.151989603747108360484758994222924880510
2.25.339018106044083012102817776589396922392
2.25.159215431212584126451402597802236328925
2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925

If I do this:

r = []
for m in a:
    for n in a:
        if m[0] == n[0]:
            r.append(m[0])

set(r)

I get this:

{'2.25.105201430514553352325644071061576888668',
 '2.25.151989603747108360484758994222924880510',
 '2.25.159215431212584126451402597802236328925',
 '2.25.23907329898781253437777953862543062317',
 '2.25.339018106044083012102817776589396922392'}

Which is correct. Why isn’t the groupby function working?

Solution:

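groupby only groups consecutive items that share a key; it does not collect every item with the same key the way GROUP BY does in SQL. From the itertools documentation: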

Make an iterator that returns consecutive keys and groups from the iterable.
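
As a minimal sketch of what "consecutive" means here, the snippet below runs groupby on a short made-up list of letters (not the data from the question). The names letters and runs are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# Hypothetical sample data: 'a' appears in two separate runs.
letters = ['a', 'a', 'b', 'a', 'c', 'c']

# groupby starts a new group every time the key changes,
# so the second run of 'a' becomes its own group.
runs = [(key, len(list(group))) for key, group in groupby(letters)]
print(runs)  # [('a', 2), ('b', 1), ('a', 1), ('c', 2)]
```

The key 'a' is reported twice because its occurrences are not adjacent, which is the same effect seen in the question's output.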

If you sort the list by the same key first, equal keys become adjacent and it works as you expect:

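from itertools import groupby

# Use a separate name so the loop variable does not shadow the key function.
key_func = lambda x: str(x[0]).strip()
for key, group in groupby(sorted(a, key=key_func), key=key_func):
    print(key)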

Output:

2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925
2.25.23907329898781253437777953862543062317
2.25.339018106044083012102817776589396922392
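
If sorting is not wanted, a different technique (not groupby) is to collect the values in a dictionary keyed on the first element. The sketch below assumes a shortened, made-up stand-in for the list from the question; rows and groups are hypothetical names.

```python
from collections import defaultdict

# Shortened, made-up stand-in for the list `a` from the question.
rows = [['x', 1], ['y', 1], ['x', 2], ['y', 1], ['z', 1]]

groups = defaultdict(list)
for key, value in rows:
    # Normalise the key the same way the question's key function does.
    groups[str(key).strip()].append(value)

for key, values in groups.items():
    print(key, values)
```

This makes a single pass over the data and keeps the keys in the order they first appear, rather than in sorted order.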