Home Formatting a dataframe into a table with main headers and sub headers using pivot

Questions

Formatting a dataframe into a table with main headers and sub headers using pivot

October 17, 2023

I have a table in csv that looks like this as a df;

     Zone   Region  Total Area  Area in Zone
0   Zone1  Region1          10            60
1   Zone1  Region2           5             2
2   Zone1  Region3          10             5
3   Zone1  Region4           5             3
4   Zone2  Region1           5             1
5   Zone2  Region2           2             0
6   Zone2  Region3           5             0
7   Zone2  Region4           2             0
8   Zone3  Region1           7             6
9   Zone3  Region2           8             2
10  Zone3  Region3           7             3
11  Zone3  Region4           8             3
12  Zone4  Region1          80            70
13  Zone4  Region2           9             2
14  Zone4  Region3          80            80
15  Zone4  Region4           9             9

I want to pivot it and save it into a excel sheet and have it look like this…

The best I could get was this code

table = df.pivot_table(values=['Total Area', 'Area in Zone'], index=['Zone'], columns='Region')

That got me this result which is close but I cannot figure out how to finish the formatting

Area in Zone                         Total Area                        
Region      Region1 Region2 Region3 Region4    Region1 Region2 Region3 Region4
Zone                                                                          
Zone1            60       2       5       3         10       5      10       5
Zone2             1       0       0       0          5       2       5       2
Zone3             6       2       3       3          7       8       7       8
Zone4            70       2      80       9         80       9      80       9

My full dataset has 58 total zones and 16 regions. I’m trying to figure out a solution that will work with N number of regions and N number of Zones so if the dimension of the dataframe changes we can still output a table showing the two area values for as many Zones and Regions in the dataset.
Is pivot_table the right tool or should I be constructing the dataframes headers and indexes and then populating it with my table converted into an numpy array of values? I seen that approach but couldn’t get it to work because the example filled the values in with random numpy array values.

>Solution :

After pivot swap the levels and sort index

table = table.swaplevel(axis=1).sort_index(axis=1)

Region Region1                 Region2                 Region3                 Region4           
       Area in Zone Total Area Area in Zone Total Area Area in Zone Total Area Area in Zone Total Area
Zone                                                                                                  
Zone1          60.0       10.0          2.0        5.0          5.0       10.0          3.0        5.0
Zone2           1.0        5.0          0.0        2.0          0.0        5.0          0.0        2.0
Zone3           6.0        7.0          2.0        8.0          3.0        7.0          3.0        8.0
Zone4          70.0       80.0          2.0        9.0         80.0       80.0          9.0        9.0