I have a dataframe with a column2 which for each row have a list of dicts.
import pandas as pd
data = [{"id":1,
"column1":123,
"column2":[{"a":1}, {"b":"X"}, {"c":'2023-01-16'}]}]
df = pd.DataFrame(data)
# id column1 column2
# 1 123 [{'a': 1}, {'b': 'X'}, {'c': '2023-01-16'}]
I’m trying to create three new columns from the dicts to create:
#id column1 a b c
# 1 123 1 X 2023-01-16
I’ve tried this:
df = df.explode(column="column2")
# column1 column2
# 0 123 {'a': 1}
# 0 123 {'b': 'X'}
# 0 123 {'c': '2023-01-16'}
df["column2"].apply(pd.Series)
# 0 1 2
# 0 {'a': 1} {'b': 'X'} {'c': '2023-01-16'}
But I cant get it to work the way I want.
How can I solve this?
>Solution :
You could combine a merge of the dictionaries (e.g. with ChainMap) and json_normalize:
from collections import ChainMap
df = df.join(pd.json_normalize([dict(ChainMap(*l)) for l in df.pop('column2')]))
# or
# df = df.join(pd.json_normalize(df.pop('column2')
# .map(lambda x: dict(ChainMap(*x)))))
Output:
id column1 c b a
0 1 123 2023-01-16 X 1