Remove string from JSON row

I’m taking several columns from a data frame and adding them to a new column.

A B C
1 3 6
1 2 4 
4 5 0 

df['D'] = df.apply(lambda x: x[['C', 'B']].to_json(), axis=1)

I’m then creating a new data frame that locates the unique instances of df['A']:

df2 = pd.DataFrame({'A': df.A.unique()})

Finally, I’m creating a new column in df2 that lists the values of df['B'] and df['C'] for each value of A:

df2['E'] = [list(set(df['D'].loc[df['A'] == x['A']]))
                     for _, x in df2.iterrows()]

But this turns each object into a string:

A B C D
1 3 6 ['{"B":"3","C":"6"}', '{"B":"2","C":"4"}']

Furthermore, when I dump this to JSON:

payload = json.dumps(data)

I get this result:

["{\"B\":\"3\",\"C\":\"6\"}", "{\"B\":\"2\",\"C\":\"4\"}"]

But ultimately I want the objects to be real JSON objects rather than strings, with this as the output:

[{"B":"3","C":"6"}, {"B":"2","C":"4"}]

Any guidance will be greatly appreciated.
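
For context, the escaped quotes appear because to_json() returns a string, so json.dumps() receives a list of strings and encodes each one a second time. The sketch below reproduces that with the standard library only; the `data` list is a hypothetical stand-in for one cell of df2['E'], not taken from the real frame. Parsing each string with json.loads() before dumping is one way to get unquoted objects.

```python
import json

# Hypothetical stand-in for one cell of df2['E']: a list of JSON *strings*.
data = ['{"B":"3","C":"6"}', '{"B":"2","C":"4"}']

# Dumping the list as-is encodes every element a second time,
# which is where the escaped quotes come from.
double_encoded = json.dumps(data)

# Parse each element back into a dict first, then dump once.
parsed = [json.loads(s) for s in data]
payload = json.dumps(parsed)
print(payload)
```

This only treats the symptom; building dicts instead of JSON strings in the first place avoids the round trip.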

>Solution :

In your case, use groupby with to_dict('records'), which keeps each row as a dict instead of a JSON string:

out = df.groupby('A').apply(lambda x : x[['B','C']].to_dict('records')).to_frame('E').reset_index()
out
Out[198]: 
   A                                     E
0  1  [{'B': 3, 'C': 6}, {'B': 2, 'C': 4}]
1  4                    [{'B': 5, 'C': 0}]
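
To get from there to the JSON payload in the question, a minimal sketch might look like the following. It is a variation on the approach above rather than the same code: it uses a dict comprehension over the groups instead of apply, and it casts B and C to str only because the desired output quotes the values. The name `records_by_a` is made up for illustration.

```python
import json
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"A": [1, 1, 4], "B": [3, 2, 5], "C": [6, 4, 0]})

# One list of row dicts per unique value of A.
# astype(str) is an assumption based on the quoted values in the desired output;
# int(a) converts the group key to a plain Python int.
records_by_a = {
    int(a): g[["B", "C"]].astype(str).to_dict("records")
    for a, g in df.groupby("A")
}

# The values are plain dicts, so json.dumps encodes them only once.
payload = json.dumps(records_by_a[1])
print(payload)
```

Dropping the astype(str) call should keep B and C as numbers in the payload, if that is preferred.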