MultiIndex level names disappear when using pd.concat

Consider the following dataframes df1 and df2:

df1: 
sim_names       Model 1          
signal_names     my_y1     my_y2
units               °C       kPa
(Time, s)                       
0.0           0.738280  1.478617
0.1           1.078653  0.486527
0.2           0.794123  0.604792
0.3           0.392690  1.072772 

df2: 
 Empty DataFrame
Columns: []
Index: [0.0, 0.1, 0.2, 0.3] 

As you can see, the columns of df1 have three levels, named "sim_names", "signal_names" and "units".
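For anyone who wants to reproduce this, the two frames can be rebuilt roughly as follows. This is only a sketch based on the printed output above: the exact way the original frames were created is not shown, so the constructor calls and the string index name "(Time, s)" are assumptions.

```python
import pandas as pd

# Column MultiIndex with the three named levels from the printout.
columns = pd.MultiIndex.from_tuples(
    [("Model 1", "my_y1", "°C"), ("Model 1", "my_y2", "kPa")],
    names=["sim_names", "signal_names", "units"],
)

# Assumption: the index name is the plain string "(Time, s)".
index = pd.Index([0.0, 0.1, 0.2, 0.3], name="(Time, s)")

df1 = pd.DataFrame(
    [
        [0.738280, 1.478617],
        [1.078653, 0.486527],
        [0.794123, 0.604792],
        [0.392690, 1.072772],
    ],
    index=index,
    columns=columns,
)

# Assumption: df2 shares the index of df1 but has no columns at all.
df2 = pd.DataFrame(index=index)

print(df1.columns.names)
print(df2.empty)
```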

Next, I want to concatenate the two dataframes, so I run the following command:

    df2 = pd.concat(
        [df1, df2],
        axis="columns",
    )

but what I get is the following:

 df2:
             Model 1          
              my_y1     my_y2
                 °C       kPa
(Time, s)                    
0.0        0.738280  1.478617
0.1        1.078653  0.486527
0.2        0.794123  0.604792
0.3        0.392690  1.072772 

As you can see, the level names are gone.

What should I do to keep the level names of df1 in the resulting df2?

The df2 I want should look like this:

df2: 
sim_names       Model 1          
signal_names     my_y1     my_y2
units               °C       kPa
(Time, s)                       
0.0           0.738280  1.478617
0.1           1.078653  0.486527
0.2           0.794123  0.604792
0.3           0.392690  1.072772 

I tried passing names=["sim_names", "signal_names", "units"] as an argument to pd.concat, but I got the same wrong result as above.

Solution:

I'm not sure, but this seems to be the expected behaviour (see GH13475).
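On the names= attempt from the question: as far as I can tell from the pandas documentation, names= labels the levels of the hierarchical index that concat builds from keys=, so passing it without keys= probably has nothing to attach to. A small sketch of what it does when combined with keys= (the key "run A" and the level name "run" are made up for illustration, and this adds an extra outer level, which is not what the question asks for):

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("Model 1", "my_y1", "°C"), ("Model 1", "my_y2", "kPa")],
    names=["sim_names", "signal_names", "units"],
)
df1 = pd.DataFrame(
    [[0.738280, 1.478617], [1.078653, 0.486527]],
    index=pd.Index([0.0, 0.1], name="(Time, s)"),
    columns=columns,
)

# keys= adds a new outermost column level; names= gives that level its name.
# "run A" and "run" are hypothetical labels, not from the question.
keyed = pd.concat([df1], axis="columns", keys=["run A"], names=["run"])

print(keyed.columns.names)
```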

As a workaround, you can restore the names with rename_axis (or by assigning to columns.names):

    out = pd.concat(
        [df1, df2],
        axis="columns",
    ).rename_axis(df1.columns.names, axis=1)  # <- added chain


Output:

print(out)

sim_names    Model 1      
signal_names   my_y1 my_y2
units             °C   kPa
(Time, s)                 
0.00            0.74  1.48
0.10            1.08  0.49
0.20            0.79  0.60
0.30            0.39  1.07
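The other option, assigning to columns.names after the concat, might look like the sketch below. The frames are rebuilt from the printout in the question, so their construction is an assumption; only the last two statements are the actual workaround.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("Model 1", "my_y1", "°C"), ("Model 1", "my_y2", "kPa")],
    names=["sim_names", "signal_names", "units"],
)
index = pd.Index([0.0, 0.1, 0.2, 0.3], name="(Time, s)")

df1 = pd.DataFrame(
    [
        [0.738280, 1.478617],
        [1.078653, 0.486527],
        [0.794123, 0.604792],
        [0.392690, 1.072772],
    ],
    index=index,
    columns=columns,
)
df2 = pd.DataFrame(index=index)  # empty frame, same index (assumed)

out = pd.concat([df1, df2], axis="columns")

# Copy the level names back from df1, whether or not concat kept them.
out.columns.names = df1.columns.names

print(out)
```

Unlike rename_axis, which returns a new object, this modifies out in place, so it cannot be chained onto the concat call.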