What is the Pythonic way to create a DataFrame from a list of nested dictionaries (two levels deep)?

I am receiving requests in the following format (I cannot change the input request format):

{
    "inputs": [
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        }
    ]
}

I want to create a DataFrame out of this dictionary with columns: TimeGenerated, counter1, counter2, counter3.

What is the most effective pythonic way to create a DataFrame out of this list of nested dictionaries?
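
A concrete instance of this shape can help when trying out the approaches on this page. The sketch below parses a hypothetical request body with the standard json module. The timestamps and counter values are made up for illustration, and the name raw_request is an assumption, not part of the original request.

```python
import json

# Hypothetical request body; timestamps and values are illustrative only.
raw_request = """
{
    "inputs": [
        {"TimeGenerated": "2023-01-01T00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2023-01-01T00:01:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
        {"TimeGenerated": "2023-01-01T00:02:00",
         "counters": {"counter1": 7.0, "counter2": 8.0, "counter3": 9.0}}
    ]
}
"""

# Parse the JSON text into a dict holding a list of two-level dictionaries.
input_json = json.loads(raw_request)

# Each record has one flat field and one nested dictionary of counters.
first = input_json["inputs"][0]
print(first["TimeGenerated"], sorted(first["counters"]))
```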


Possible Solution (Not the Most Efficient One)

The solution I have found is:

import pandas as pd

x = []
for i in input_json['inputs']:
    # Copy the inner dictionary so the original request is not modified.
    # We do not want "counters" itself in the column headers, only its keys.
    counters = dict(i['counters'])
    # Add the timestamp so each row is a single flat dictionary:
    # {"counter1": ..., "counter2": ..., "counter3": ..., "_time": "datetimestring"}
    counters['_time'] = i['TimeGenerated']
    x.append(counters)
in_df = pd.DataFrame(x)                            # Build the DataFrame from the list of flat dictionaries
in_df['_time'] = pd.to_datetime(in_df['_time'])    # Convert the datetime strings to datetime values

But I am sure there are more efficient ways to achieve this!


Similar Questions (with different expected end-results)

Some other questions on Stack Overflow address similar concerns but expect different results. I am listing them for anyone who landed here while searching for one of those results. They also make a good point of comparison for how Python dictionaries, lists, and DataFrames relate to each other.

  1. Python Dataframe contains a list of dictionaries, need to create new dataframe with dictionary items
  2. Create pandas dataframe from nested dict with outer keys as df index and inner keys column headers
  3. Create Dataframe from a nested dictionary

Solution:

Assuming every "counters" object has the same keys in the same order, you can take the keys from the first one and use them as the column names. The values of each record are then matched to those columns by position.

import pandas as pd  # j is the parsed request dictionary (input_json in the question)
columns = ['TimeGenerated', *j['inputs'][0]['counters'].keys()]
df = pd.DataFrame([[t['TimeGenerated'], *t['counters'].values()] for t in j['inputs']], columns=columns)

Output

>>> df
    TimeGenerated  counter1  counter2  counter3
0  datetimestring   123.456   123.456   123.456
1  datetimestring   123.456   123.456   123.456
2  datetimestring   123.456   123.456   123.456
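
As an alternative that is not part of the answer above, pandas also provides json_normalize, which flattens nested dictionaries and matches values by key rather than by position. The sketch below is a minimal example under stated assumptions: the sample data is hypothetical, and the second record deliberately omits counter3 to show how a missing key is handled.

```python
import pandas as pd

# Hypothetical sample request; timestamps and values are illustrative only.
j = {
    "inputs": [
        {"TimeGenerated": "2023-01-01T00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2023-01-01T00:01:00",
         "counters": {"counter1": 4.0, "counter2": 5.0}},  # counter3 is missing
    ]
}

# Flatten each record; nested keys become "counters.counter1" and so on.
df = pd.json_normalize(j["inputs"])

# Strip the "counters." prefix so the columns carry the plain counter names.
df.columns = [c.split(".", 1)[-1] for c in df.columns]

# Convert the timestamp strings to datetime values.
df["TimeGenerated"] = pd.to_datetime(df["TimeGenerated"])

print(df)
```

Because values are aligned by key, a record with a missing counter ends up with NaN in that column instead of shifting the remaining values into the wrong columns.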