Home How to replace string values of a Pandas column with a string except some values?

Questions

How to replace string values of a Pandas column with a string except some values?

August 23, 2022

Example DataFrame:

import pandas as pd
df = pd.DataFrame({'Age' : [70.0, 58.0, 44.0, 40.0, 21.0, 35.0, 12.0, 43.0, 45.0, 65.0, 56.0, 31.0, 30.0,
                            52.0, 59.0, 52.0, 31.0, 55.0, 42.0, 73.0],
                   'MarketSegment' : ['Travel Agent/Operator', 'Other', 'Other', 'Other', 'Other',
                                      'Direct', 'Groups', 'Other', 'Other', 'Direct', 'Other',
                                      'Other', 'Other', 'Other', 'Groups', 'Groups', 'Other', 'Other',
                                      'Groups', 'Other'],
                   'Nationality' : ['CAN', 'ESP', 'FRA', 'DEU', 'GBR', 'RUS', 'IRL', 'FRA', 'IRL',
                                    'BRA', 'LTU', 'CHE', 'FRA', 'GBR', 'FRA', 'PRT', 'DEU', 'ESP',
                                    'CHE', 'USA']})

First, I only want the top 3 most common nationalities. I used the code below:

top_nat = df.groupby('Nationality').count().sort_values \
(by='Age', ascending = False).head(3).iloc[:, 0].index.to_list()

(Is there any way to do it using only the frequency of unique values in the ‘Nationalities’ column? And by not using any other column, like ‘Age’?)

Now I want all the values in ‘Nationalities’ to be replaced by ‘OTR’ except the values == those in top_nat. I tried stuff like these:

df['Nationality'].replace(~top_nat,'OTR', inplace=True)

df["Nationality"] = df["Nationality"].apply(lambda x: x.replace(~top_nat, "OTR"))

for x in top_nat:
    df.loc[df['Nationality'] != x, 'Nationality'] = 'OTR'

Nothing’s working. Maybe I want something like:

if values in df.Nationality != values in top_nat:
    replace that value in df.Nationality with 'OTR'
else:
    continue

Shape of original dataset is (82580, 30) and I need top 15 nationalities. Please help.

>Solution :

First get the top 3:

top = df['Nationality'].value_counts().nlargest(3).index

then set the nationality

df.loc[~df['Nationality'].isin(top), 'Nationality'] = 'OTR'

This leaves the top 3 nationalities as they are and replaces everything else to ‘OTR’

data-cleaning

byMR

Published August 23, 2022

Add a comment

How can I extract and pair the values of an array based object

byMR

August 23, 2022

Questions

Pandas : Convert 1D dataframe to 2D dataframe

byMR

August 23, 2022

Questions

Angular: ng-content doesn't render with *ngIf then else

byMR

August 23, 2022

Questions

plotting n number of equal points in circular direction in python

byMR

August 23, 2022

Questions

How to detect whether the element is the first child with the given class

byMR

August 23, 2022

Questions

"Skip" value is giving error , return negetive value by default

byMR

August 23, 2022

How to replace string values of a Pandas column with a string except some values?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How can I extract and pair the values of an array based object

Pandas : Convert 1D dataframe to 2D dataframe

Angular: ng-content doesn't render with *ngIf then else

plotting n number of equal points in circular direction in python

How to detect whether the element is the first child with the given class

"Skip" value is giving error , return negetive value by default

Keep Up to Date with the Most Important News

How to replace string values of a Pandas column with a string except some values?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How can I extract and pair the values of an array based object

Pandas : Convert 1D dataframe to 2D dataframe

Angular: ng-content doesn't render with *ngIf then else

plotting n number of equal points in circular direction in python

How to detect whether the element is the first child with the given class

"Skip" value is giving error , return negetive value by default

Discover more from Dev solutions