use pandas col to store % of values in dict format

I have a dataframe like as below


I would like to do the below

a) Groupby name

b) get the unique count (nunique) of id and PL for each name

With the help of this post, I managed to do the below

cols = ['id','PL']
for col in cols:
    d = {i: x.value_counts(normalize=True).to_dict() for i, x in cf.groupby('name')[col]}

But the above doesn’t work for multiple columns (purely due to my limitation)

Is there anyway to make it work for multiple columns? My real data has 5 million rows and I have store count values in a % format under dict (for around 5-6 columns). So, any elegant and efficient solution welcome

I expect my output to be like below

name  id_list          AL_list

A    {1:0.5,2:0.5}     {22:1}
B    {5:0.66,4:0.33}   {21:0.33,23:0.33,24:0.33}
C    {6:1.0}           {21:1}

UPDATE – incorrect output based on answer try

enter image description here

>Solution :

You can do groupby with value_counts then convert to dict

out = pd.concat([df.groupby('name')[x].value_counts(normalize=True).round(2).reset_index(level=0).groupby('name').agg(dict) for x in ['id','AL']],axis=1).add_suffix('_list').reset_index()
  name             id_list                         AL_list
0    A    {1: 0.5, 2: 0.5}                       {22: 1.0}
1    B  {5: 0.67, 4: 0.33}  {21: 0.33, 23: 0.33, 24: 0.33}
2    C            {6: 1.0}                       {21: 1.0}

Leave a Reply