Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Transforming a multi array json data into a flatterned dataframe with python and pandas

I get an arrayed multi level JSON response with one level which is a date and another one which duplicats the data in a lower lever array from an API which looks like this:

{
   "2021-11-04": {
      "40-41 (25-27)": {
         "sales": 26,
         "balance": 480,
         "size_name": "40-41",
         "size_origin": "25-27"
      },
      "42-43 (27-29)": {
         "sales": 63,
         "balance": 817,
         "size_name": "42-43",
         "size_origin": "27-29"
      }
   },
   "2021-11-05": {
      "40-41 (25-27)": {
         "sales": 35,
         "balance": 445,
         "size_name": "40-41",
         "size_origin": "25-27"
      },
      "42-43 (27-29)": {
         "sales": 95,
         "balance": 725,
         "size_name": "42-43",
         "size_origin": "27-29"
      }
   }
}

But what i need is to make it not an array but a flatten objects to easily form a dataframe with pandas. How it can be done?

The needed result:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

{
  { 
    "day": "2021-11-04",
    "sales": 26,
    "balance": 480,
    "size_name": "40-41",
    "size_origin": "25-27"
  },
  {
    "day": "2021-11-04",
    "sales": 63,
    "balance": 817,
    "size_name": "42-43",
    "size_origin": "27-29"
   },
   { 
    "day": "2021-11-05",
    "sales": 35,
    "balance": 445,
    "size_name": "40-41",
    "size_origin": "25-27"
  },
  {
    "day": "2021-11-05",
    "sales": 95,
    "balance": 725,
    "size_name": "42-43",
    "size_origin": "27-29"
   }
}

I’m ok with transformation of it inside pandas and not while it’s in JSON format but I still don’t get it how can such structure can be transformed.

>Solution :

You can do it with a simple double loop

data = {
   "2021-11-04": {
      "40-41 (25-27)": {
         "sales": 26,
         "balance": 480,
         "size_name": "40-41",
         "size_origin": "25-27"
      },
      "42-43 (27-29)": {
         "sales": 63,
         "balance": 817,
         "size_name": "42-43",
         "size_origin": "27-29"
      }
   },
   "2021-11-05": {
      "40-41 (25-27)": {
         "sales": 35,
         "balance": 445,
         "size_name": "40-41",
         "size_origin": "25-27"
      },
      "42-43 (27-29)": {
         "sales": 95,
         "balance": 725,
         "size_name": "42-43",
         "size_origin": "27-29"
      }
   }
}

records = []
for date, date_dict in data.items():
    for rec_id, rec in date_dict.items():
        rec['day'] = date 
        records.append(rec)

Output

>>> records

[{'sales': 26,
  'balance': 480,
  'size_name': '40-41',
  'size_origin': '25-27',
  'day': '2021-11-04'},
 {'sales': 63,
  'balance': 817,
  'size_name': '42-43',
  'size_origin': '27-29',
  'day': '2021-11-04'},
 {'sales': 35,
  'balance': 445,
  'size_name': '40-41',
  'size_origin': '25-27',
  'day': '2021-11-05'},
 {'sales': 95,
  'balance': 725,
  'size_name': '42-43',
  'size_origin': '27-29',
  'day': '2021-11-05'}]

>>> pd.DataFrame(records)

   sales  balance size_name size_origin         day
0     26      480     40-41       25-27  2021-11-04
1     63      817     42-43       27-29  2021-11-04
2     35      445     40-41       25-27  2021-11-05
3     95      725     42-43       27-29  2021-11-05
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading