extract values as columns into dataframe from same dataframe json string

Let have a data frame with json string like below

df = pd.DataFrame([{'sno':1, 'values':"[{'a':1, 'b':2,'c':3}]"},
                   {'sno':2, 'values':"[{'a':4, 'b':5,'c':6}]"}])

Need to get the json string column as new columns in the same dataframe. Tried the below code but, getting the error as "the JSON object must be str, bytes or bytearray, not Series"

df = df.assign(a=lambda x: (json.loads(x['values'])[0]['a']))

When we try to use the lambda it should be getting the values for that row only but json.loads(x[‘values’]) actually returning a series and this is causing the error.

Is there any other ways to get this achieved?

Expected dataframe like below

sno    a     b     c
------------------------
1      1     2     3
2      4     5     6

>Solution :

If I guess correctly, what you want is to extract the value of a considering a dictionary in a list.

I would use ast.literal_eval here:

from ast import literal_eval
df['a'] = df['values'].apply(lambda x: literal_eval(x)[0]['a'])

Or:

df['a'] = [literal_eval(x)[0]['a'] for x in df['values']]

Then it won’t matter whether or not you have valid JSON (double quotes and not single quotes).

Output:

   sno                  values  a
0    1  [{'a':1, 'b':2,'c':3}]  1
1    2  [{'a':4, 'b':5,'c':6}]  4

If you need all columns:

out = df.join(pd.DataFrame([literal_eval(x)[0] for x in df['values']],
                           index=df.index))

Output:

   sno                  values  a  b  c
0    1  [{'a':1, 'b':2,'c':3}]  1  2  3
1    2  [{'a':4, 'b':5,'c':6}]  4  5  6

Or to overwrite df and drop the original values:

df = df.join(pd.DataFrame([literal_eval(x)[0] for x in df.pop('values')],
                          index=df.index))

Output:

   sno  a  b  c
0    1  1  2  3
1    2  4  5  6

Used input:

df = pd.DataFrame({'sno': [1, 2],
                   'values': ["[{'a':1, 'b':2,'c':3}]",
                              "[{'a':4, 'b':5,'c':6}]"]})

Leave a Reply