Is there a simple way to remove duplicate values in certain cells of a dataframe column?

March 19, 2022

I have a dataframe column of city locations, and some of the cells contain the same value (city) twice. How can I get rid of one of the values? For example, instead of a cell saying Dublin Dublin, it should say Dublin only once.

I have tried df['city'].apply(set) but it doesn’t give me what I am looking for.

Any advice is much appreciated. Please see the image below:

Solution:

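Calling set directly on a string, as in df['city'].apply(set), produces a set of its characters rather than its words, which is why that attempt did not work. Instead, you can split each string on whitespace, convert each resulting list of words to a set (which removes duplicates but does not preserve the original word order), and then re-join the words with a space: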

df['city'] = df['city'].str.split().apply(set).str.join(' ')

Output:

>>> df
             city
0  Los CA Angeles
1            none
2          London
3          Dublin
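
Because sets are unordered, the set-based approach can scramble multi-word values, which appears to be what happened with "Los CA Angeles" in row 0. If word order matters, one possible alternative (not part of the original answer) is to deduplicate with dict.fromkeys, which keeps the first occurrence of each word in its original position. This is a minimal sketch: the helper name dedupe_words and the sample data are assumptions made for illustration.

```python
import pandas as pd


def dedupe_words(cell):
    """Drop repeated words from a string, keeping first-seen order."""
    if not isinstance(cell, str):
        return cell  # leave NaN/None and other non-string values untouched
    # dict keys are unique and keep insertion order (Python 3.7+)
    return " ".join(dict.fromkeys(cell.split()))


# Hypothetical sample data, loosely modeled on the output shown above
df = pd.DataFrame(
    {"city": ["Los Angeles CA", "none", "London London", "Dublin Dublin"]}
)

df["city"] = df["city"].apply(dedupe_words)
print(df)
```

Note that any word-level deduplication, including the set-based one, also collapses names that legitimately repeat a word (for example, "Walla Walla" becomes "Walla"), so it is worth checking the data for such cases first.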

by MR
