How to delete duplicated elements in a CSV column

I need help deleting duplicated entries in the language column, where the same language appears more than once, using Python.

Here is my csv:

import pandas as pd

f = pd.DataFrame({'Movie': ['name1', 'name2', 'name3', 'name4'],
                  'Year': ['1905', '1905', '1906', '1907'],
                  'Id': ['tt0283985', 'tt0283986', 'tt0284043', 'tt3402904'],
                  'language': ['Mandarin,Mandarin', 'Mandarin,Cantonese,Mandarin',
                               'Mandarin,Cantonese', 'Cantonese,Cantonese']})

Where f now looks like:


   Movie  Year         Id   language
0  name1  1905  tt0283985  Mandarin,Mandarin
1  name2  1905  tt0283986  Mandarin,Cantonese,Mandarin
2  name3  1906  tt0284043  Mandarin,Cantonese
3  name4  1907  tt3402904  Cantonese,Cantonese

And the result should be like this:

   Movie  Year         Id             language
0  name1  1905  tt0283985            Mandarin
1  name2  1905  tt0283986            Mandarin,Cantonese
2  name3  1906  tt0284043            Mandarin,Cantonese
3  name4  1907  tt3402904            Cantonese

I am having trouble writing a function to remove the duplicated values in the language column.
Thanks in advance!

Solution:

Try this. Note that a plain `set` would drop duplicates but does not guarantee order, so `dict.fromkeys` is used instead: it removes repeats while keeping each language's first occurrence, which matches the expected output.

f['language'].str.split(',').map(lambda x: ','.join(dict.fromkeys(x)))

Output:

0              Mandarin
1    Mandarin,Cantonese
2    Mandarin,Cantonese
3             Cantonese
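To get the DataFrame shown in the question (rather than just the transformed Series), the result can be assigned back to the column. A minimal end-to-end sketch, assuming pandas is available:

```python
import pandas as pd

f = pd.DataFrame({'Movie': ['name1', 'name2', 'name3', 'name4'],
                  'Year': ['1905', '1905', '1906', '1907'],
                  'Id': ['tt0283985', 'tt0283986', 'tt0284043', 'tt3402904'],
                  'language': ['Mandarin,Mandarin', 'Mandarin,Cantonese,Mandarin',
                               'Mandarin,Cantonese', 'Cantonese,Cantonese']})

# Split each cell on commas, drop repeats while preserving first-seen order
# (dict keys keep insertion order in Python 3.7+), and write the result back.
f['language'] = f['language'].str.split(',').map(lambda x: ','.join(dict.fromkeys(x)))

print(f)
```

When reading from an actual CSV file, the same line works after `f = pd.read_csv(...)`, since `str.split` operates on the string column directly.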