Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to split dictionary column in dataframe and make a new columns for each key values

I have a dataframe which has a column containing multiple values, separated by ",".

id data
0   {'1':A, '2':B, '3':C}
1   {'1':A}
2   {'0':0}

How can I split up the keys-values of ‘data’ column and make a new column for each key values present in it, without removing the original ‘data’ column.

desired output.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

id data                   1   2   3   0
0   {'1':A, '2':B, '3':C} A   B   C   Nan
1   {'1':A}               A   Nan Nan Nan
2   {'0':0}               Nan Nan Nan 0

Thank you in advance :).

>Solution :

You’ll need a regular expression to convert the data into a format that can be parsed as JSON. Then, pd.json_normalize will do the job nicely:

df['data'] = df['data'].str.replace(r'(["\'])\s*:(.+?)\s*(,?\s*["\'}])', '\\1:\'\\2\'\\3', regex=True)

import ast
df['data'] = df['data'].apply(ast.literal_eval)

df = pd.concat([df, pd.json_normalize(df['data'])], axis=1)

Output:

>>> df
                             data    1    2    3    0
0  {'1': 'A', '2': 'B', '3': 'C'}    A    B    C  NaN
1                      {'1': 'A'}    A  NaN  NaN  NaN
2                      {'0': '0'}  NaN  NaN  NaN    0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading