Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to change the grouping of a column multi-index in a pandas pivot table?

Let’s say I have a dataframe like this:

data = {'City': ['Rochester', 'Anaheim', 'Toledo', 'Rochester', 'Anaheim', 'Anaheim', 'Toledo', 'Rochester', 'Rochester', 'Rochester', 'Toledo', 'Toledo', 'Toledo', 'Anaheim'],
        'PersonID': [4930, 7343, 4368, 6909, 4574, 4086, 5024, 3642, 9997, 4745, 1207, 6081, 7832, 6309],
        'MoneySpent': [100, 1710, 20, 910, 2040, 1100, 490, 70, 1940, 100, 1240, 80, 1420, 2090],
        'StayDuration': ['< 2 days', '2-7 days', '2-7 days', '7-30 days', '7-30 days', '< 2 days', '2-7 days', '7-30 days', '7-30 days', '2-7 days', '7-30 days', '< 2 days', '< 2 days', '7-30 days']
       }

df = pd.DataFrame(data)
    City        PersonID    MoneySpent  StayDuration
0   Rochester   4930        100         < 2 days
1   Anaheim     7343        1710        2-7 days
2   Toledo      4368        20          2-7 days
3   Rochester   6909        910         7-30 days
4   Anaheim     4574        2040        7-30 days
5   Anaheim     4086        1100        < 2 days
6   Toledo      5024        490         2-7 days
7   Rochester   3642        70          7-30 days
8   Rochester   9997        1940        7-30 days
9   Rochester   4745        100         2-7 days
10  Toledo      1207        1240        7-30 days
11  Toledo      6081        80          < 2 days
12  Toledo      7832        1420        < 2 days
13  Anaheim     6309        2090        7-30 days

I am then building a pivot table to show the number of people and their total expenses by stay duration for each city:

pv = pd.pivot_table(df,
                    index='City',
                    columns='StayDuration',
                    values=['PersonID', 'MoneySpent'],
                    aggfunc={'PersonID': 'count', 'MoneySpent': 'sum'}
                   )

What I’m seeing is metrics at the first level (headcount or expenses), then categories inside them:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

                                      MoneySpent                            PersonID
StayDuration    2-7 days    7-30 days   < 2 days    2-7 days    7-30 days   < 2 days
City                        
Anaheim         1710        4130        1100        1           2           1
Rochester       100         2920        100         1           3           1
Toledo          510         1240        1500        2           1           2

What I want is to have categories first, and metrics inside them, like this:

            2-7 days                7-30 days               < 2 days    
            PersonID   MoneySpent   PersonID   MoneySpent   PersonID   MoneySpent  
Anaheim     1          1710         2          4130         1          1100
Rochester   1          100          3          2920         1          100
Toledo      2          510          1          1240         2          1500

Which is, incidentally, the default view for an Excel pivot table.

It’s taking me surprisingly long to figure out how to make Python produce the same result. Is it possible to change the order the columns are grouped in?

>Solution :

As far as I know, pandas pivot will always sort the columns in that fashion. You will require some manipulations to get the output you need:

pv.swaplevel(0,1,axis=1).sort_index(axis=1).reindex(['PersonID', 'MoneySpent'], level=1, axis=1)

Output:

StayDuration 2-7 days            7-30 days            < 2 days           
             PersonID MoneySpent  PersonID MoneySpent PersonID MoneySpent
City                                                                     
Anaheim             1       1710         2       4130        1       1100
Rochester           1        100         3       2920        1        100
Toledo              2        510         1       1240        2       1500
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading