Converting dictionary of lists of dictionaries to a dataframe

February 22, 2023

Say I have a dict defined as:

dict = {'1': [{'name': 'Hospital 0',
               'students': 5,
               'grad': 71},

              {'name': 'Hospital 1',
               'students': 8,
               'grad': 74}],

        '2': [{'name': 'Hospital 0',
               'students': 11,
               'grad': 72},

              {'name': 'Hospital 1',
               'students': 10,
               'grad': 78}]}

Suppose I want to make a dataframe from this formatted as follows:

step	name	students	grad
1	Hospital 0	5	71
1	Hospital 1	8	74
2	Hospital 0	11	72
2	Hospital 1	10	78

Do you guys have any ideas?

>Solution :

Here is an approach using json_normalize(). Note: I am using data as the variable name instead of dict, because dict is a Python built-in type and assigning to it would shadow it.

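import pandas as pd

# Build one frame per key, tagging every row with its key in a "step" column.
# The filter keeps only keys whose first record has a "name" field.
dfs = [pd.json_normalize(data[key]).assign(step=key) for key in data if "name" in data[key][0]]
df = pd.concat(dfs, ignore_index=True)
# Move "step" to the front to match the requested column order.
df = df[["step", "name", "students", "grad"]]
print(df)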

  step        name  students  grad
0    1  Hospital 0         5    71
1    1  Hospital 1         8    74
2    2  Hospital 0        11    72
3    2  Hospital 1        10    78
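
For reference, here is a self-contained sketch of an alternative that skips json_normalize() and flattens the nested structure with a plain list comprehension. It assumes the question's data is bound to data (not dict) and that every inner record is a flat dictionary; rows is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question, bound to `data` rather than `dict`.
data = {
    "1": [
        {"name": "Hospital 0", "students": 5, "grad": 71},
        {"name": "Hospital 1", "students": 8, "grad": 74},
    ],
    "2": [
        {"name": "Hospital 0", "students": 11, "grad": 72},
        {"name": "Hospital 1", "students": 10, "grad": 78},
    ],
}

# One output row per inner record; the outer key goes first as "step".
rows = [
    {"step": step, **record}
    for step, records in data.items()
    for record in records
]

df = pd.DataFrame(rows)
print(df)
```

With this approach an empty list under a key simply contributes no rows. The step column holds the original string keys; if integers are wanted, df["step"] = df["step"].astype(int) should convert them.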