I have a DataFrame like the one below:
Name Series
=============================
A A1
B B1
A A2
A A1
B B2
I need to collect the Series values into a list for each Name, as a dict or JSON object like the one below:
{
"A": ["A1", "A2"],
"B": ["B1", "B2"]
}
So far I have tried using groupby, but it just groups the rows instead of producing a dict:
test = df.groupby("Name")[["Series"]].apply(lambda x: x)
The above code returns a DataFrame like this:
Series
Name
A 0 A1
2 A2
3 A1
B 1 B1
4 B2
Any help is much appreciated.
Thanks,
>Solution :
First use drop_duplicates to ensure the values are unique, then groupby.agg with list:
out = df.drop_duplicates().groupby('Name')['Series'].agg(list).to_dict()
Or with unique:
out = df.groupby('Name')['Series'].agg(lambda x: x.unique().tolist()).to_dict()
Output: {'A': ['A1', 'A2'], 'B': ['B1', 'B2']}
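For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies both variants. The variable names out_dedup and out_unique are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "Name": ["A", "B", "A", "A", "B"],
    "Series": ["A1", "B1", "A2", "A1", "B2"],
})

# Variant 1: drop duplicated rows first, then collect each group into a list
out_dedup = df.drop_duplicates().groupby("Name")["Series"].agg(list).to_dict()

# Variant 2: deduplicate inside each group with unique()
out_unique = (
    df.groupby("Name")["Series"]
      .agg(lambda s: s.unique().tolist())
      .to_dict()
)

print(out_dedup)   # {'A': ['A1', 'A2'], 'B': ['B1', 'B2']}
print(out_unique)  # {'A': ['A1', 'A2'], 'B': ['B1', 'B2']}
```

Both variants keep the values in order of first appearance within each Name.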
If you have other columns, make sure to keep only those of interest before dropping duplicates:
out = (df[['Name', 'Series']].drop_duplicates()
       .groupby('Name')['Series'].agg(list).to_dict()
      )
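To see why the column selection matters, here is a small sketch with an extra column. The "Value" column is hypothetical and not part of the question: because it differs on every row, a plain drop_duplicates() removes nothing and the repeated "A1" survives.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "A", "A", "B"],
    "Series": ["A1", "B1", "A2", "A1", "B2"],
    "Value": [1, 2, 3, 4, 5],  # hypothetical extra column, unique per row
})

# Without selecting columns, every row is distinct, so "A1" appears twice
naive = df.drop_duplicates().groupby("Name")["Series"].agg(list).to_dict()
print(naive)  # {'A': ['A1', 'A2', 'A1'], 'B': ['B1', 'B2']}

# Restricting to the two columns of interest deduplicates as intended
out = (
    df[["Name", "Series"]].drop_duplicates()
      .groupby("Name")["Series"].agg(list).to_dict()
)
print(out)  # {'A': ['A1', 'A2'], 'B': ['B1', 'B2']}
```

The unique() variant is not affected by extra columns, since it only ever looks at the Series column.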
To also sort the lists:
out = (df.groupby('Name')['Series']
       .agg(lambda x: sorted(x.unique().tolist())).to_dict()
      )
Example:
# input
Name Series
0 A Z1
1 B B1
2 A A2
3 A Z1
4 B B2
# output
{'A': ['A2', 'Z1'], 'B': ['B1', 'B2']}
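The sorted example can be reproduced with the sketch below. Since the question also mentions a JSON object, it additionally serializes the dict with the standard-library json module; that last step is an addition for illustration, not part of the original answer.

```python
import json

import pandas as pd

# Input frame from the sorted example
df = pd.DataFrame({
    "Name": ["A", "B", "A", "A", "B"],
    "Series": ["Z1", "B1", "A2", "Z1", "B2"],
})

# Unique values per Name, sorted alphabetically
out = (
    df.groupby("Name")["Series"]
      .agg(lambda s: sorted(s.unique().tolist()))
      .to_dict()
)
print(out)  # {'A': ['A2', 'Z1'], 'B': ['B1', 'B2']}

# The result is a plain dict of lists of str, so it serializes directly
as_json = json.dumps(out)
print(as_json)  # {"A": ["A2", "Z1"], "B": ["B1", "B2"]}
```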