How to merge rows in dictionary python using pandas

October 18, 2023

i have a dataframe like that

MV id   NAME   ADDRESS  DOC  DOCTYPE PHONE
1  100  Mark   Home     299  NI      {123,456}
2  100  John   Work     A123 Pass    {789,101}
3  100         Club

what i want to do is to merge the columns that has the same id into one cell in dictionary like this and make the key of a value in dictionary from another column

id   NAME              ADDRESS                          DOC                    PHONE
100  {1:Mark,2:John}   {1:'Home',2:'Work',3:'Club'}     {NI:'299',Pass:'A123'} {1:{123,456},2:{789,101}}

as u can see i used 2 columns as a key column mv used as a key in (Name,Address,Phone)
and i used column DOCTYPE as key in DOC so how can i do something like that i tried this

agg={'id':'first','NAME':dict,'ADDRESS':dict,'PHONE':dict}
    df_new=df.groupby(['CUSTOMER_CODE'],as_index=False).aggregate(agg)
    return df_new

but it gave me this output

id   Name             Address                 Phone
100  {0:Mark,1:John}  {0:Home,1:Work,2:Club}  {0:{123,456},1:{789,101}}

>Solution :

Assuming your empty cells are NaN, you can use a custom groupby.agg:

Let’s first see how we could aggregate with a single reference:

df.set_index('MV').groupby('id', as_index=False).agg(lambda s: s.dropna().to_dict())

Output:

    id                    NAME                            ADDRESS                    DOC               DOCTYPE                             PHONE
0  100  {1: 'Mark', 2: 'John'}  {1: 'Home', 2: 'Work', 3: 'Club'}  {1: '299', 2: 'A123'}  {1: 'NI', 2: 'Pass'}  {1: '{123,456}', 2: '{789,101}'}

generic answer for different references

Now we can adapt that to use different reference columns:

def f(s):
    return s.dropna().to_dict()

d = {'MV': ['NAME', 'ADDRESS', 'PHONE'],
     'DOCTYPE': ['DOC']
    }

out = (pd.concat([df.set_index(k).groupby('id')[v].agg(f)
                 for k, v in d.items()], axis=1)
         .reset_index()
      )

Output:

    id                    NAME                            ADDRESS                             PHONE                            DOC
0  100  {1: 'Mark', 2: 'John'}  {1: 'Home', 2: 'Work', 3: 'Club'}  {1: '{123,456}', 2: '{789,101}'}  {'NI': '299', 'Pass': 'A123'}