Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to rename Pandas columns based on mapping?

I have a dataframe where column Name contains values such as the following (the rest of the columns do not affect how this question is answered I hope):

Chicken
Chickens
Fluffy Chicken
Whale
Whales
Blue Whale
Tiger
White Tiger
Big Tiger

Now, I want to ensure that we rename these entries to be like the following:

Chicken
Chicken
Chicken
Whale
Whale
Whale
Tiger
Tiger
Tiger

Essentially substituting anything that has ‘Chicken’ to just be ‘Chicken, anything with ‘Whale’ to be just ‘Whale, and anything with ‘Tiger’ to be just ‘Tiger’.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

What is the best way to do this? There are almost 1 million rows in the dataframe.

Sorry just to add, I have a list of what we expect i.e.

['Chicken', 'Whale', 'Tiger']

They can appear in any order in the column

What I should also add is, the column might contain things like "Mushroom" or "Eggs" that do not need substituting from the original list.

>Solution :

Try with str.extract

#l = ['Chicken', 'Whale', 'Tiger']

df['new'] = df['col'].str.extract('('+'|'.join(l)+')')[0]
Out[10]: 
0    Chicken
1    Chicken
2    Chicken
3      Whale
4      Whale
5      Whale
6      Tiger
7      Tiger
8      Tiger
Name: 0, dtype: object
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading