I have a list which I am trying to group by the first element of each nested list. The problem is that my code does not recognise that there are only 5 distinct values of the element I am grouping on.
a = [['2.25.151989603747108360484758994222924880510', 1],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.151989603747108360484758994222924880510', 2],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.23907329898781253437777953862543062317', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.151989603747108360484758994222924880510', 1],
['2.25.159215431212584126451402597802236328925', 1],
['2.25.339018106044083012102817776589396922392', 1],
['2.25.159215431212584126451402597802236328925', 1]]
Just as a simple test, I ran the following (thinking there may be something odd about the variable I’m grouping on):
from itertools import groupby
for key, group in groupby(a, lambda x: str(x[0]).strip()):
    print(key)
I get this:
2.25.151989603747108360484758994222924880510
2.25.23907329898781253437777953862543062317
2.25.151989603747108360484758994222924880510
2.25.23907329898781253437777953862543062317
2.25.159215431212584126451402597802236328925
2.25.339018106044083012102817776589396922392
2.25.159215431212584126451402597802236328925
2.25.23907329898781253437777953862543062317
2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925
2.25.151989603747108360484758994222924880510
2.25.339018106044083012102817776589396922392
2.25.159215431212584126451402597802236328925
2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925
If I do this:
r = []
for m in a:
    for n in a:
        if m[0] == n[0]:
            r.append(m[0])
set(r)
I get this:
{'2.25.105201430514553352325644071061576888668',
'2.25.151989603747108360484758994222924880510',
'2.25.159215431212584126451402597802236328925',
'2.25.23907329898781253437777953862543062317',
'2.25.339018106044083012102817776589396922392'}
Which is correct. Why isn’t the groupby function working?
>Solution :
groupby only groups consecutive items that share a key; it does not collect all items with the same key, as GROUP BY does in SQL. From the documentation:
Make an iterator that returns consecutive keys and groups from the iterable.
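To make the "consecutive" part concrete, here is a minimal sketch on a toy list (the name letters and its contents are made up for illustration, not taken from the question). It counts the length of each run that groupby yields, first on the unsorted list and then on a sorted copy:

```python
from itertools import groupby

# Hypothetical toy data: the same two keys, interleaved.
letters = ["a", "a", "b", "a", "b", "b"]

# Unsorted: a new group starts every time the key changes.
runs = [(key, len(list(group))) for key, group in groupby(letters)]
print(runs)  # [('a', 2), ('b', 1), ('a', 1), ('b', 2)]

# Sorted: equal keys are adjacent, so each key appears exactly once.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(letters))]
print(sorted_runs)  # [('a', 3), ('b', 3)]
```

The unsorted call yields four groups for two distinct keys, which is the same effect seen in the question's output.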
If you sort the list by the same key first, it will work as you expect:
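from itertools import groupby

# Sort and group with the same key function; groupby only merges adjacent items.
keyfunc = lambda x: str(x[0]).strip()
for key, group in groupby(sorted(a, key=keyfunc), key=keyfunc):
    print(key)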
Output:
2.25.151989603747108360484758994222924880510
2.25.159215431212584126451402597802236328925
2.25.23907329898781253437777953862543062317
2.25.339018106044083012102817776589396922392
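If sorting is not wanted (for example, to keep the order in which keys first appear), one possible alternative is to collect the items into a dictionary instead of using groupby. This is only a sketch: sample is a shortened list in the same shape as a, and the names groups, identifier and count are chosen here for illustration.

```python
from collections import defaultdict

# Shortened sample in the same shape as `a`: [identifier, count].
sample = [
    ["2.25.151989603747108360484758994222924880510", 1],
    ["2.25.23907329898781253437777953862543062317", 1],
    ["2.25.151989603747108360484758994222924880510", 2],
    ["2.25.23907329898781253437777953862543062317", 1],
]

# Map each identifier to the list of counts seen for it, in input order.
groups = defaultdict(list)
for identifier, count in sample:
    groups[str(identifier).strip()].append(count)

for identifier, counts in groups.items():
    print(identifier, counts)
```

This makes a single pass over the data and does not depend on equal keys being adjacent, at the cost of holding all groups in memory at once.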