Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Convert Pandas dataframe to nested JSON (without nesting as lists)

I’m aware there are other threads on this topic, but I’m running into an issue that no other answer seems to address.

Given the following Pandas dataframe:

a  b  c  d
a1 b1 c1 d1
a2 b2 c2 d2

I would like to create a nested JSON object with the following structure:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

[
    {
        "a": "a1",
        "b": "b1",
        "nested_group":
            {
                "c": "c1",
                "d": "d1"
            }
    },
    {
        "a": "a2",
        "b": "b2",
        "nested_group":
            {
                "c": "c2",
                "d": "d2"
            }
    }
]

The solution I’ve found in other threads is to use the following code:

j = (df.groupby(['a','b']) 
      .apply(lambda x: x[['c','d']].to_dict('records')) 
      .reset_index() 
      .rename(columns={0:'nested_group'}) 
      .to_json(orient='records'))

However, the issue I’m running into is that the above code places each nested_group object inside square brackets, like so:

"nested_group": [
    {
        "c": "c2",
        "d": "d2"
    }
]

The object I’m trying to generate is intended to be the payload for an API call, and unfortunately the square brackets around each inner dictionary cause the API to return an error. (The outermost brackets at the very top/bottom of the object are fine.) How can I make Python NOT treat the inner dictionaries as lists?

Code to reproduce the example dataframe:

import numpy as np
import pandas as pd

array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
df = pd.DataFrame(data=array, columns=['a','b','c','d'])

Thank you in advance 🙂

>Solution :

Looking at the docs for to_dict it seems like we still have to use the records option, and if we assume it will always be a list of 1 element, just take the 0th element using your original code

>>> import numpy as np
>>> import pandas as pd
>>> array = np.array([['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2']])
>>> df = pd.DataFrame(data=array, columns=['a','b','c','d'])
>>> (df.groupby(['a','b']) 
      .apply(lambda x: x[['c','d']].to_dict('records')[0]) 
      .reset_index() 
      .rename(columns={0:'nested_group'}) 
      .to_json(orient='records'))
'[{"a":"a1","b":"b1","nested_group":{"c":"c1","d":"d1"}},{"a":"a2","b":"b2","nested_group":{"c":"c2","d":"d2"}}]'
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading