Duplicate substring removal from list

I have a dataframe with a product_type column that has duplicate substrings within strings:

df_1

product_type
bag,bag
tote bag,bag

handbag,handbag

I’m using this line to remove the duplicate substrings and store the result in a new column, "unique_type":

df_1['unique_type'] = [set(sub.split(',')) for sub in df_1["product_type"]]

This is what the new dataframe looks like:

current output

product_type         unique_type
bag,bag              {'bag'}
tote bag,bag         {'tote bag', 'bag'}
                     {''}
handbag,handbag      {'handbag'}

The problem is that the values in the new column unique_type are sets, so they are displayed with curly brackets and quotation marks. I would like to produce a column of plain strings, without curly brackets or quotation marks, like so:

desired output

product_type         unique_type
bag,bag              bag
tote bag,bag         tote bag, bag
                 
handbag,handbag      handbag

>Solution :

Join the elements of each set back into a single string with ', '.join:

df_1['unique_type'] = [', '.join(set(sub.split(','))) for sub in df_1["product_type"]]
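A set has no defined order, so the joined string may come out as "bag, tote bag" or "tote bag, bag". If the order only needs to be reproducible rather than original, sorting each set before joining is one option. A minimal sketch, using a plain list in place of the df_1["product_type"] column so that it runs without pandas:

```python
# Sample values from the question's product_type column
product_types = ["bag,bag", "tote bag,bag", "", "handbag,handbag"]

# sorted() turns each set into an alphabetically ordered list before joining,
# so the result is the same on every run
unique_types = [", ".join(sorted(set(sub.split(",")))) for sub in product_types]
print(unique_types)
```

The same expression should work as the right-hand side of the df_1['unique_type'] assignment when iterating over df_1["product_type"] instead of the list.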

Or, if you need to keep the original order of the values, use the dict.fromkeys trick (dictionary keys are unique and keep insertion order):

df_1['unique_type1'] = [', '.join(dict.fromkeys(sub.split(',')))
                        for sub in df_1["product_type"]]


print(df_1)
      product_type    unique_type   unique_type1
0          bag,bag            bag            bag
1     tote bag,bag  bag, tote bag  tote bag, bag
2                                               
3  handbag,handbag        handbag        handbag
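If the real data has spaces after the commas (for example "tote bag, bag"), splitting on ',' alone leaves " bag" and "bag" as different items. The sketch below strips each part before de-duplicating. The helper name dedupe_parts is hypothetical and not part of pandas, and the sample list stands in for the dataframe column:

```python
def dedupe_parts(value, sep=","):
    """Remove duplicate sep-separated parts from value, keeping first-seen order."""
    # Strip whitespace around each part so " bag" and "bag" count as the same item
    parts = [part.strip() for part in value.split(sep)]
    # dict.fromkeys keeps the first occurrence of each key, in insertion order;
    # empty parts are skipped so an empty input yields an empty string
    unique = dict.fromkeys(part for part in parts if part)
    return ", ".join(unique)


samples = ["bag,bag", "tote bag, bag", "", "handbag, handbag"]
results = [dedupe_parts(s) for s in samples]
print(results)
```

With the dataframe, this could be applied as df_1['unique_type'] = [dedupe_parts(sub) for sub in df_1["product_type"]], assuming every value in the column is a string (missing values would need to be handled first).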