Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how to use pd.json_normalize to retrieve the data I need

I have this list/json in Python,

[{'id': 'TC2-FFA',
  'shortCode': 'TC2-FFA',
  'dataSet': {'datumPrecision': 2,
   'id': 'TC2_37',
   'shortCode': 'TC2_37',
   'shortDescription': 'Clean Continent to US Atlantic coast',
   'displayGroup': 'BCTI',
   'datumUnit': 'Worldscale',
   'data': [{'value': 156.11, 'date': '2023-03-06'}],
   'apiIdentifier': 'RDSX9KRCHQI9TVGID5O7XQGBP1KKBZ0F'},
  'datumUnit': 'WS',
  'datumPrecision': 3,
  'projectionStartOn': '2005-01-04T00:00:00',
  'projectionEndOn': '2023-03-06T00:00:00',
  'apiIdentifier': 'RPSBTGHKN64SV91SV9R3492RCH33D2OH'},
 {'id': 'TC2$-FFA',
  'shortCode': 'TC2$-FFA',
  'dataSet': {'datumPrecision': 2,
   'id': 'TC2_37',
   'shortCode': 'TC2_37',
   'shortDescription': 'Clean Continent to US Atlantic coast',
   'displayGroup': 'BCTI',
   'datumUnit': 'Worldscale',
   'data': [{'value': 156.11, 'date': '2023-03-06'}],
   'apiIdentifier': 'RDSX9KRCHQI9TVGID5O7XQGBP1KKBZ0F'},
  'datumUnit': '$/mt',
  'datumPrecision': 3,
  'projectionStartOn': '2010-05-10T00:00:00',
  'projectionEndOn': '2023-03-06T00:00:00',
  'apiIdentifier': 'RPSH1H9454DYUE7G8CLHVLFPJZ3BVM77'}]

how to use pandas.json_normalize to only retrieve the data ‘shortCode'(or ‘id’, as they seem to have the same value) and ‘data’ under ‘dataSet’ (the path is ‘dataSet’–‘data’–‘value’ and ‘date’)?

This is the desired dataframe

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  shortCode  data.value   data.date
0   TC2-FFA      156.11  2023-03-06
1   TC2$-FFA      156.11  2023-03-06

I tried pd.json_normalize(lst_object, record_path=['dataSet', ['shortCode', ['data', 'value'], ['data', 'date']])
but failed

>Solution :

It can be confusing.

You want all values from dataSet.data so they are your records.

You can then use meta to bring in shortCode

pd.json_normalize(
   data, 
   record_path=['dataSet', 'data'], 
   meta=['shortCode']
)
    value        date shortCode
0  156.11  2023-03-06   TC2-FFA
1  156.11  2023-03-06  TC2$-FFA
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading