How do I mutate a column of dictionaries in a Pandas DataFrame?
Given the following DataFrame:
import json
import pandas as pd
data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
# add a column holding one empty dict per row
df = df.assign(my_dict="{}")
df.my_dict = df.my_dict.apply(json.loads)
| Name | Age | my_dict |
|---|---|---|
| tom | 10 | {} |
| nick | 15 | {} |
| juli | 14 | {} |
How would I operate on column my_dict and mutate it as follows:
Age > 10:
| Name | Age | my_dict |
|---|---|---|
| tom | 10 | {"age>10": false} |
| nick | 15 | {"age>10": true} |
| juli | 14 | {"age>10": true} |
And then mutate again:
Name = "tom":
| Name | Age | my_dict |
|---|---|---|
| tom | 10 | {"age>10": false, "name=tom": true} |
| nick | 15 | {"age>10": true, "name=tom": false} |
| juli | 14 | {"age>10": true, "name=tom": false} |
I’m interested in the process of mutating the dictionary; the rules themselves are arbitrary examples.
>Solution :
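You can merge a new key into each row's dictionary with the dict union operator | (Python 3.9+) inside a row-wise apply: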
df['my_dict'] = df.apply(lambda x: x['my_dict'] | {'Age': x['Age'] > 10}, axis=1)
print(df)
# Output
   Name  Age         my_dict
0   tom   10  {'Age': False}
1  nick   15   {'Age': True}
2  juli   14   {'Age': True}
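The | operator on dictionaries only exists from Python 3.9 onward. On older interpreters, the same row-wise merge can be sketched with dict unpacking, which also builds a new dictionary rather than modifying the old one. The key name "age>10" is taken from the question, and the way the empty column is built here is just one convenient option:

```python
import pandas as pd

df = pd.DataFrame([["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"])
# One independent empty dict per row (a list comprehension avoids sharing one object).
df["my_dict"] = [{} for _ in range(len(df))]

# {**a, key: value} creates a new dict, like a | {key: value}, without needing Python 3.9.
df["my_dict"] = df.apply(
    lambda row: {**row["my_dict"], "age>10": bool(row["Age"] > 10)}, axis=1
)
print(df["my_dict"].tolist())
```

Because each call builds a fresh dictionary, the previous contents of my_dict are left untouched until the column is reassigned.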
Add a new condition:
df['my_dict'] = df.apply(lambda x: x['my_dict'] | {'Name': x['Name'] == 'tom'}, axis=1)
print(df)
# Output
   Name  Age                       my_dict
0   tom   10  {'Age': False, 'Name': True}
1  nick   15  {'Age': True, 'Name': False}
2  juli   14  {'Age': True, 'Name': False}
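Since the rules are arbitrary, the two steps above can be generalized. The following is a minimal sketch, not the only way to do it: the rules table and the add_flag helper are hypothetical names introduced for illustration. Each rule computes a boolean mask over the whole frame, and the helper merges that flag into every row's dictionary.

```python
import pandas as pd

df = pd.DataFrame([["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"])
df["my_dict"] = [{} for _ in range(len(df))]  # one independent dict per row

# Hypothetical rule table: label -> function returning a boolean Series.
rules = {
    "age>10": lambda d: d["Age"] > 10,
    "name=tom": lambda d: d["Name"] == "tom",
}

def add_flag(frame, label, mask):
    """Return a copy of frame with {label: flag} merged into each row's dict."""
    merged = [
        {**old, label: bool(flag)}  # bool() ensures a plain Python bool in the dict
        for old, flag in zip(frame["my_dict"], mask)
    ]
    return frame.assign(my_dict=merged)

# Apply the rules one after another; each pass adds one key per row.
for label, rule in rules.items():
    df = add_flag(df, label, rule(df))

print(df["my_dict"].tolist())
```

Computing the mask once per rule keeps the comparison vectorized; only the dictionary merge itself runs row by row.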
To convert the dictionaries to JSON strings, use:
>>> df['my_dict'].apply(json.dumps)
0    {"Age": false, "Name": true}
1    {"Age": true, "Name": false}
2    {"Age": true, "Name": false}
Name: my_dict, dtype: object
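If the JSON strings are stored and later need to be mutated again, they can be parsed back with json.loads, mirroring how the question created the column. A small round-trip sketch, using a standalone Series with made-up sample values rather than the full DataFrame:

```python
import json
import pandas as pd

# Sample dictionaries shaped like the column above (values are illustrative).
flags = pd.Series(
    [{"Age": False, "Name": True}, {"Age": True, "Name": False}], name="my_dict"
)

as_json = flags.apply(json.dumps)     # dict -> JSON string
restored = as_json.apply(json.loads)  # JSON string -> dict

print(as_json.tolist())
print(restored.tolist() == flags.tolist())
```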