Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to preserve multiindex order on join in pandas?

Consider The following Example Code:

import pandas as pd

df1 = pd.DataFrame({'data1':[1,2,3,4]}, index=pd.MultiIndex.from_product([['a','b'],[1,2],], names=['index1','Index2']))


df2 = pd.DataFrame({'data2':[5,6]}, index=pd.MultiIndex.from_product([[1,2]], names=['Index2']))


df3 = df1.join(df2, how='left',sort=False)


If I print df1, The index is in Order as I expect it:

               data1
index1 Index2       
a      1           1
       2           2
b      1           3
       2           4

Now I want to add additional data with df2:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

        data2
Index2       
1           5
2           6

but after the merge with df2, the Order of the Index has changed. Index2 is now the first level. I explicitly tried to forbid it with using sort=False, but it still promotes the joined index to the first level:

print(df3)

               data1  data2
Index2 index1              
1      a           1      5
       b           3      5
2      a           2      6
       b           4      6

The pandas documentation states, that how='left' does accomplish that, but it does not seem to work.

Is there a way i can enforce the resulting index_columns of the merge to be in the same order as df1? Like this:

               data1  data2
Index1 index2              
a     1           1      5
b      1           3      5
a      2           2      6
b      2           4      6

>Solution :

You can swaplevel and reindex:

df3 = df1.join(df2, how='left').swaplevel().reindex(df1.index)

Or reorder_levels:

df3 = df1.join(df2, how='left').reorder_levels(df1.index.names).reindex(df1.index)

Output:

               data1  data2
index1 Index2              
a      1           1      5
       2           2      6
b      1           3      5
       2           4      6
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading