Pandas explode dictionary to row while maintaining multi-index

Having checked a multitude of Stack Overflow threads on this, I’m struggling to apply the answers to my particular use case, so I’m hoping someone can help with my specific problem.

I’m trying to explode data out of a dictionary into two separate columns while maintaining a multi-index.

Here is what I currently have:

| short_url | platform | css_problem_files                                                     |
|-----------|----------|-----------------------------------------------------------------------|
| /url_1/   | desktop  | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
|           | mobile   | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
| /url_2/   | desktop  | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |
|           | mobile   | {css_file_1: css_value, css_file_2: css_value, css_file_3: css_value} |

and here is what I would like to achieve:

| short_url | platform | css_file   | css_value |
|-----------|----------|------------|-----------|
| /url_1/   | desktop  | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
|           | mobile   | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
| /url_2/   | desktop  | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |
|           | mobile   | css_file_1 | css_value |
|           |          | css_file_2 | css_value |
|           |          | css_file_3 | css_value |

The closest I’ve come is the code below, but it produces over 200K rows when I’d expect only a few thousand (and it doesn’t include platform yet):

m = pd.DataFrame([*df['css_problem_files']], df.index).stack()\
      .rename_axis([None,'css_files']).reset_index(1, name='pct usage')

out = df[['short_url']].join(m)

Any assistance or a pointer in the right direction would be greatly appreciated.
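
One possible cause of the unexpected row count, which the question does not confirm, is that the index used for the join contains duplicate labels. `DataFrame.join` matches on index labels, so every left row with a given label is paired with every right row carrying the same label. The sketch below uses made-up frames (`left`, `right`) purely to show the effect:

```python
import pandas as pd

# Hypothetical frames: the label 0 appears twice on the left
# and four times on the right.
left = pd.DataFrame({"short_url": ["/url_1/", "/url_1/"]}, index=[0, 0])
right = pd.DataFrame(
    {"css_files": ["css_file_1", "css_file_2", "css_file_1", "css_file_2"]},
    index=[0, 0, 0, 0],
)

# Each of the 2 left rows matches all 4 right rows.
joined = left.join(right)
print(len(joined))           # 8
print(left.index.is_unique)  # False
```

Checking `df.index.is_unique` before joining is a quick way to rule this out.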

Solution:

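If you turn each dictionary into a list of key-value pairs, you can explode that list and then split the pairs into two new columns with .apply(pd.Series), renaming them to your liking. The result keeps the original index, so the multi-index is preserved as long as short_url and platform are index levels: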

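import pandas as pd

df = (
    df["css_problem_files"]
    .apply(lambda d: list(d.items()))  # dict -> list of (key, value) tuples
    .explode()                         # one row per tuple; index labels repeat
    .apply(pd.Series)                  # split each tuple into columns 0 and 1
    .rename(columns={0: "css_file", 1: "css_value"})  # name the new columns
)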
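
The steps above can be tried end to end on a small frame. The following is a minimal sketch, assuming short_url and platform are already index levels; the sample file names and values are placeholders, not the asker's data. It builds the two columns with the `pd.DataFrame` constructor instead of `.apply(pd.Series)`, a swapped-in variant that is usually faster on large frames, and it drops rows produced by empty dictionaries, which `explode` turns into NaN.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout in the question.
index = pd.MultiIndex.from_product(
    [["/url_1/", "/url_2/"], ["desktop", "mobile"]],
    names=["short_url", "platform"],
)
files = {"css_file_1": 0.1, "css_file_2": 0.2, "css_file_3": 0.3}
df = pd.DataFrame(
    {"css_problem_files": [dict(files) for _ in range(len(index))]},
    index=index,
)

# One (key, value) tuple per dictionary entry. explode repeats the index
# labels for every tuple, so both index levels survive.
pairs = (
    df["css_problem_files"]
    .apply(lambda d: list(d.items()))
    .explode()
    .dropna()  # empty dictionaries explode to NaN; drop those rows
)

# Split the tuples into two named columns, reusing the exploded index.
out = pd.DataFrame(
    pairs.tolist(),
    index=pairs.index,
    columns=["css_file", "css_value"],
)
print(out)
```

If css_file should become a third index level, as in the desired layout, `out.set_index("css_file", append=True)` does that.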