Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Group data in pivot Pandas

hello I made a table with the help of a pandas, then I created a pivot with two indexes., I would like to group this data like that, but one index was the header of another. Below I will show what is happening in the tables and what I would like the result
Table:

Name Lang Skill Corp
Michael java 2 Google
Piter C++ 3 Facebook
Cristiano python 5 Google
Michael java 1 Facebook
Piter C++ 2 Google
Cristiano python 3 Facebook
Michael java 4 Google
Piter C++ 5 Facebook
Cristiano python 1 Google
Michael python 2 Facebook

I used:

pivot = pd.pivot_table(df, values="Skill", index=["Corp", "Name"], columns = "Lang", aggfunc="sum")

and I have pivot:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Corp Name C++ java python
Facebook Cristiano nan nan 3
Facebook Michael nan 1 2
Facebook Piter 8 nan nan
Google Cristiano nan nan 6
Google Michael nan 6 nan
Google Piter 2 nan nan

the result I would like:

Name C++ java python
Facebook sum_fb sum_fb sum_fb
Cristiano nan nan 3
Michael nan 1 2
Piter 8 nan nan
Google sum_google sum_google sum_google
Cristiano nan nan 6
Michael nan 6 nan
Piter 2 nan nan

Thank You in advance

>Solution :

You can aggregate sum by Corp level, which is first by GroupBy.sum, append index to MultiIndex with same values:

df1 = (pivot.groupby(level=0).sum()
            .assign(Name = lambda x: x.index)
            .set_index('Name', append=True))

Or use level=[0,0] for MultiIndex, only necessary set names by DataFrame.rename_axis:

df1 = pivot.groupby(level=[0,0]).sum().rename_axis(['Corp','Name'])
print (df1)         
Lang               C++  java  python
Corp     Name                       
Facebook Facebook  8.0   1.0     5.0
Google   Google    2.0   6.0     6.0

Then is appended pivot DataFrame by concat, but is necessary sorting for correct ordering by first level of MultiIndex with DataFrame.sort_index, remove first level Corp by DataFrame.droplevel and last convert Name to column with remove columns name Lang by DataFrame.rename_axis:

df = (pd.concat([df1, pivot])
        .sort_index(level=0, sort_remaining=False)
        .droplevel(0)
        .reset_index()
        .rename_axis(None, axis=1))
print (df)
        Name  C++  java  python
0   Facebook  8.0   1.0     5.0
1  Cristiano  NaN   NaN     3.0
2    Michael  NaN   1.0     2.0
3      Piter  8.0   NaN     NaN
4     Google  2.0   6.0     6.0
5  Cristiano  NaN   NaN     6.0
6    Michael  NaN   6.0     NaN
7      Piter  2.0   NaN     NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading