Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Getting ValueError – non-unique multi-index when using explode() in Pandas

I want to split two columns at their comma and bring them back to the original pandas dataframe. I tried to explode() but I got an error with ValueError: cannot handle a non-unique multi-index! I wonder how I can overcome this error.

import pandas as pd
data = {'fruit_tag': {0: 'apple, organge', 1: 'watermelon', 2: 'banana', 3: 'banana', 4: 'apple, banana'}, 'location': {0: 'Hong Kong , London', 1: 'New York, Tokyo', 2: 'Singapore', 3: 'Singapore, Hong Kong', 4: 'Tokyo'}, 'rating': {0: 'bad', 1: 'good', 2: 'good', 3: 'bad', 4: 'good'}, 'measure_score': {0: 0.9529434442520142, 1: 0.952498733997345, 2: 0.9080725312232971, 3: 0.8847543001174927, 4: 0.8679852485656738}}
dt = pd.DataFrame.from_dict(data)
dt.\
    set_index(['rating', 'measure_score']).\
    apply(lambda x: x.str.split(',').explode())

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

When you explode, the index are the same for (each) old rows. Pandas doesn’t know (or like) to align these indexes, because intention of users can be different from case to case, e.g. align by order, or cross merge. In your case, for example, what do you expect to get from row 1 where you have 2 entries for each column? How about row 2?

If you want a cross merge, you would need to explode manually:

def explode(x, col): return x.assign(**{col:x[col].str.split(', ')}).explode(col)

explode(explode(dt, 'fruit_tag'), 'location')

Output:

    fruit_tag    location rating  measure_score
0       apple  Hong Kong     bad       0.952943
0       apple      London    bad       0.952943
0     organge  Hong Kong     bad       0.952943
0     organge      London    bad       0.952943
1  watermelon    New York   good       0.952499
1  watermelon       Tokyo   good       0.952499
2      banana   Singapore   good       0.908073
3      banana   Singapore    bad       0.884754
3      banana   Hong Kong    bad       0.884754
4       apple       Tokyo   good       0.867985
4      banana       Tokyo   good       0.867985
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading