I have this list/json in Python,
[{'id': 'TC2-FFA',
'shortCode': 'TC2-FFA',
'dataSet': {'datumPrecision': 2,
'id': 'TC2_37',
'shortCode': 'TC2_37',
'shortDescription': 'Clean Continent to US Atlantic coast',
'displayGroup': 'BCTI',
'datumUnit': 'Worldscale',
'data': [{'value': 156.11, 'date': '2023-03-06'}],
'apiIdentifier': 'RDSX9KRCHQI9TVGID5O7XQGBP1KKBZ0F'},
'datumUnit': 'WS',
'datumPrecision': 3,
'projectionStartOn': '2005-01-04T00:00:00',
'projectionEndOn': '2023-03-06T00:00:00',
'apiIdentifier': 'RPSBTGHKN64SV91SV9R3492RCH33D2OH'},
{'id': 'TC2$-FFA',
'shortCode': 'TC2$-FFA',
'dataSet': {'datumPrecision': 2,
'id': 'TC2_37',
'shortCode': 'TC2_37',
'shortDescription': 'Clean Continent to US Atlantic coast',
'displayGroup': 'BCTI',
'datumUnit': 'Worldscale',
'data': [{'value': 156.11, 'date': '2023-03-06'}],
'apiIdentifier': 'RDSX9KRCHQI9TVGID5O7XQGBP1KKBZ0F'},
'datumUnit': '$/mt',
'datumPrecision': 3,
'projectionStartOn': '2010-05-10T00:00:00',
'projectionEndOn': '2023-03-06T00:00:00',
'apiIdentifier': 'RPSH1H9454DYUE7G8CLHVLFPJZ3BVM77'}]
how to use pandas.json_normalize to only retrieve the data ‘shortCode'(or ‘id’, as they seem to have the same value) and ‘data’ under ‘dataSet’ (the path is ‘dataSet’–‘data’–‘value’ and ‘date’)?
This is the desired dataframe
shortCode data.value data.date
0 TC2-FFA 156.11 2023-03-06
1 TC2$-FFA 156.11 2023-03-06
I tried pd.json_normalize(lst_object, record_path=['dataSet', ['shortCode', ['data', 'value'], ['data', 'date']])
but failed
>Solution :
It can be confusing.
You want all values from dataSet.data
so they are your records.
You can then use meta to bring in shortCode
pd.json_normalize(
data,
record_path=['dataSet', 'data'],
meta=['shortCode']
)
value date shortCode
0 156.11 2023-03-06 TC2-FFA
1 156.11 2023-03-06 TC2$-FFA