Converting a Dictionary to DataFrame in Python – 3 Keys and 1 Value Depth

Following on from an earlier post: how do I adapt that solution for one more level of key depth, or for n levels?


I have a dictionary of a static structure:

Key: Key: Key: Value

Example Dictionary:

{
    "id_1": {
        "Emissions": {
            "305-1": [
                "2014_249989",
                "2015_339998",
                "2016_617957",
                "2017_827230"
            ],
            "305-2": [
                "2014_33163",
                "2015_64280",
                "2016_502748",
                "2017_675091"
            ],
        },
        "Effluents and Waste": {
            "306-1": [
                "2014_143.29",
                "2015_277.86",
                "2016_385.67",
                "2017_460.6"
            ],
            "306-2": "blah blah blah",
        }
    }
}

I want a DataFrame of this structure:

Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value

Example Desired DataFrame:

id_1 | Emissions | 305-1 | ["2014_249989", "2015_339998", "2016_617957", "2017_827230"]
id_1 | Emissions | 305-2 | ["2014_33163", "2015_64280", "2016_502748", "2017_675091"]
id_1 | Effluents and Waste | 306-1 | ["2014_143.29", "2015_277.86", "2016_385.67", "2017_460.6"]
id_1 | Effluents and Waste | 306-2 | blah blah blah

Attempted Solution:

I have made many attempts, all of which amount to adding one more for-loop to the comprehension:

data = [[key, ikey, jkey, value] for key, values in data.items() for ikey, value in values.items() for jkey, value in values.items()]

Please let me know if there are further details or nuances I could clarify.

Solution:

Try:

import pandas as pd

data = {
    "id_1": {
        "Emissions": {
            "305-1": ["2014_249989","2015_339998","2016_617957","2017_827230"],
            "305-2": ["2014_33163","2015_64280","2016_502748","2017_675091"],
        },
        "Effluents and Waste": {
            "306-1": ["2014_143.29","2015_277.86","2016_385.67","2017_460.6"],
            "306-2": "blah blah blah",
        }
    }
}


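def nested_items(d, path=None):
    """Yield (path, value) pairs for every non-dict leaf of a nested dict."""
    # Normalise the path once so a non-dict value at the top level also works.
    path = [] if path is None else path
    for key, value in d.items():
        if isinstance(value, dict):
            # Descend one level, extending the list of keys seen so far.
            yield from nested_items(value, path + [key])
        else:
            yield path + [key], value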


res = pd.DataFrame([[*path, value] for path, value in nested_items(data)])
print(res)

Output

      0  ...                                                  3
0  id_1  ...  [2014_249989, 2015_339998, 2016_617957, 2017_8...
1  id_1  ...  [2014_33163, 2015_64280, 2016_502748, 2017_675...
2  id_1  ...  [2014_143.29, 2015_277.86, 2016_385.67, 2017_4...
3  id_1  ...                                     blah blah blah

[4 rows x 4 columns]
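
For the fixed three-key case, the comprehension from the question can also be repaired directly. The attempt reuses `values` and `value` at every level, so the inner loops never descend into the nested dicts; each loop needs its own variable and must iterate over the dict bound by the loop before it. The sketch below assumes every leaf sits exactly three keys deep. The variable names and column labels are illustrative (the labels are borrowed from the desired layout in the question), and the lists are shortened for brevity.

```python
import pandas as pd

# Shortened copy of the example dictionary (two list entries per key).
data = {
    "id_1": {
        "Emissions": {
            "305-1": ["2014_249989", "2015_339998"],
            "305-2": ["2014_33163", "2015_64280"],
        },
        "Effluents and Waste": {
            "306-1": ["2014_143.29", "2015_277.86"],
            "306-2": "blah blah blah",
        },
    }
}

# One loop per key level; each loop iterates over the dict bound by the
# previous loop, under a distinct name so nothing is shadowed.
rows = [
    [grand, parent, child, value]
    for grand, parents in data.items()
    for parent, children in parents.items()
    for child, value in children.items()
]

# Column labels are an illustrative choice, not part of the original answer.
df = pd.DataFrame(
    rows, columns=["Grand Key", "Parent Key", "Child Key", "Child Value"]
)

# to_string() prints every column instead of eliding the middle ones.
print(df.to_string())
```

This version breaks as soon as a value at the first or second level is not a dict, so for arbitrary or uneven depth the recursive `nested_items` generator above is the safer choice.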