Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Can I create a nested column pandas dataframe using a nested dictionary?

I am trying to think out of the box here, and my idea might be really bad. Feel free to point out better alternatives. I want to create a nested column pandas dataframe, both for visualization and analysis purposes. The output should look like this –

           Marks
  Physics   |      Biology
 Theo|Prac  |     Theo|Prac
  99 | 100  |      89 | 100 

My data is stored in the form of a nested dictionary –

nested_dict = {
    'Marks': {
        'Physics': {
            'Theo': 99,
            'Prac': 100
        },
        'Biology': {
            'Theo': 89,
            'Prac': 100
        }
    }
}

I think the above table looks great for visualization but I am not sure if it’ll make things easy for analysis. For analysis, I’d need to do operations on the subgroups, For eg: Physics percentage = (Theo + Prac)/200*100. Is a nested panda dataframe the best way to do the analysis?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Is there a way I can do that? Doing pd.DataFrame.from_dict(nested_dict) doesn’t seem to work. This is what I get –

                             Marks
Biology  {'Theo': 89, 'Prac': 100}
Physics  {'Theo': 99, 'Prac': 100}

>Solution :

Un-nesting can be achieved manually with concat and a dictionary comprehension, then unstack+to_frame and transposition:

out = (pd.concat({k: pd.DataFrame(v)
                  for k, v in nested_dict.items()},
                 axis=1)
         .unstack().to_frame().T
      )

You could also use json_normalize and convert the columns to MultiIndex:

out = pd.json_normalize(nested_dict)
out.columns = out.columns.str.split('.', expand=True)

Output:

    Marks                  
  Physics      Biology     
     Theo Prac    Theo Prac
0      99  100      89  100
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading