How to split multiindex columns without creating 'nan' column name

I have a data frame whose column names encode several levels separated by '/', as shown below (the data frame was flattened from a nested dictionary):

Index(['A/service1/service2/200',
       ....
       'D/service1/service2/500/std'],)

Now when I try to split the columns using this line of code

df.columns = df.columns.str.split('/', expand=True)

it creates nan column names, as shown below. I can't rename or drop this 'nan' column.

Index(['A','service1','service2','200', nan,
       ....
       'D','service1', 'service2', '500', 'std'],)

I intend to convert the data frame to a nested dictionary. Can anyone help?

Solution:

You can use a nested dictionary comprehension that splits the flattened keys on '/':

import pandas as pd

# Flattened column names, as in the question
c = ['A/service1/service2/200',
     'D/service1/service2/500/std']

df = pd.DataFrame([[3296, 1000]], columns=c, index=['ts'])
print(df)

# For each row, split every column name on '/' and use the resulting tuple as the key
out = {k: {tuple(k1.split('/')): v1 for k1, v1 in v.items()}
       for k, v in df.to_dict('index').items()}
print(out)
# Output:
# {'ts': {('A', 'service1', 'service2', '200'): 3296,
#         ('D', 'service1', 'service2', '500', 'std'): 1000}}
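
The result above keys each value by a tuple of path parts. If a genuinely nested dictionary is wanted, as the question asks, one possible approach is to walk each split path and create the intermediate dictionaries along the way. The sketch below is not part of the original answer. The helper name `nest` is hypothetical, and the literal `flat` stands in for what `df.to_dict('index')` returns for the sample frame, so the snippet runs without pandas. It also assumes that no column path is a prefix of another (for example, both 'A/x' and 'A/x/y'), since a key cannot hold a value and a sub-dictionary at the same time.

```python
# Stand-in for df.to_dict('index') on the sample frame (an assumption for illustration)
flat = {'ts': {'A/service1/service2/200': 3296,
               'D/service1/service2/500/std': 1000}}


def nest(row, sep='/'):
    """Turn {'a/b/c': value} into {'a': {'b': {'c': value}}} (hypothetical helper)."""
    nested = {}
    for key, value in row.items():
        *parents, leaf = key.split(sep)
        node = nested
        # Descend through the path, creating empty dicts for missing levels
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


nested = {index: nest(row) for index, row in flat.items()}
print(nested)
# {'ts': {'A': {'service1': {'service2': {'200': 3296}}},
#         'D': {'service1': {'service2': {'500': {'std': 1000}}}}}}
```

Because the paths are never expanded into a MultiIndex, paths of different lengths need no padding, so no nan level appears.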