How to add a new row and new column to a multiindex Pandas dataframe?

I am trying to use .loc to add a new row and a new column to a MultiIndex pandas DataFrame in a single step, by specifying both axes. The problem is that it creates the new index entry without the new column, and at the same time throws an obscure KeyError: 6.

How can I do this? A one-line solution would be much appreciated.

> df
                  side    total       value
city code type                             
NaN  NTE  urban  ouest  0.01949  391.501656

> df.loc[(np.nan, 'NTE', 'rural'), 'population'] = 1000
KeyError: 6

> df
                  side    total       value
city code type                             
NaN  NTE  urban  ouest  0.01949  391.501656
NaN  NTE  rural    NaN      NaN         NaN

Now, when I run the same command again, it complains that the index doesn't exist:

> df.loc[(np.nan, 'NTE', 'rural'), 'population'] = 1000
KeyError: (nan, 'NTE', 'rural')

The desired output would be this dataframe:

                  side    total       value  population
city code type                                         
NaN  NTE  urban  ouest  0.01949  391.501656         NaN
NaN  NTE  rural    NaN      NaN         NaN        1000

>Solution :

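The problem is the missing value (NaN) in the index key. A possible workaround is to assign the row using an empty string as a placeholder in the first level, and then rename the placeholder to NaN: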

df.loc[('', 'NTE', 'rural'), 'population'] = 1000
print (df.index)
MultiIndex([(nan, 'NTE', 'urban'),
            ( '', 'NTE', 'rural')],
           names=['city', 'code', 'type'])

df = df.rename({'':np.nan}, level=0)

print (df.index)

MultiIndex([(nan, 'NTE', 'urban'),
            (nan, 'NTE', 'rural')],
           names=['city', 'code', 'type'])

print (df)
                  side    total       value  population
city code type                                         
NaN  NTE  urban  ouest  0.01949  391.501656         NaN
          rural    NaN      NaN         NaN      1000.0
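
For anyone who wants to try the workaround end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the printed output in the question (the original code that built df is not shown), and the behaviour of .loc enlargement may differ between pandas versions, so treat it as an illustration rather than a guaranteed recipe.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question's printed output.
index = pd.MultiIndex.from_tuples(
    [(np.nan, "NTE", "urban")],
    names=["city", "code", "type"],
)
df = pd.DataFrame(
    {"side": ["ouest"], "total": [0.01949], "value": [391.501656]},
    index=index,
)

# Step 1: enlarge with '' as a placeholder for NaN in the first level.
# This adds both the new row and the new 'population' column.
df.loc[("", "NTE", "rural"), "population"] = 1000

# Step 2: turn the placeholder back into NaN on level 0 only.
df = df.rename(index={"": np.nan}, level=0)

print(df)
```

The placeholder has to be a value that does not already occur in the first index level, otherwise the rename step would also change existing rows.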