Python/Pandas: Update a field in same row if another field in row contains criteria in string

I have a list of five car brands (BMW, Ford, Porsche, Skoda, Mazda), and the data in my dataframe will only ever come from one of these five brands.

The Model column always contains the brand, but the Brand column is sometimes not populated. Is there a way to update Brand based on which brand string Model contains? The brand may not always be at the start of the Model string.

My Dataframe:

  Brand    Model
0 nan      Mazda CX-5
1 BMW      BMW X5
2 BMW      BMW 2 Series
3 Ford     Ford Focus
4 Porsche  Porsche Cayenne
5 Skoda    Skoda Octavia
6 nan      911 Porsche
7 nan      8 Series BMW

My Desired Dataframe:

  Brand    Model
0 Mazda    Mazda CX-5
1 BMW      BMW X5
2 BMW      BMW 2 Series
3 Ford     Ford Focus
4 Porsche  Porsche Cayenne
5 Skoda    Skoda Octavia
6 Porsche  911 Porsche
7 BMW      8 Series BMW

Solution:

You can use a regex to extract the first letters-only word from Model and pass it to fillna, so that only the missing Brand values are filled.

df['Brand'] = df['Brand'].fillna(df['Model'].str.extract('([a-zA-Z]+)',
                                                         expand=False))

Output (rows 0 to 6):

     Brand            Model
0    Mazda       Mazda CX-5
1      BMW           BMW X5
2      BMW     BMW 2 Series
3     Ford       Ford Focus
4  Porsche  Porsche Cayenne
5    Skoda    Skoda Octavia
6  Porsche      911 Porsche

NB. If the brand name is not always the first letters-only word (for example, "8 Series BMW" would yield "Series"), this approach fails and you need another way to identify the brand.
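To make that limitation concrete, here is a minimal sketch that rebuilds the sample data from the question and applies the first-word regex to all eight rows. The DataFrame construction is an assumption (the question only shows printed output), and the missing brands are assumed to be real NaN values.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed: missing brands are NaN).
df = pd.DataFrame({
    'Brand': [np.nan, 'BMW', 'BMW', 'Ford', 'Porsche', 'Skoda', np.nan, np.nan],
    'Model': ['Mazda CX-5', 'BMW X5', 'BMW 2 Series', 'Ford Focus',
              'Porsche Cayenne', 'Skoda Octavia', '911 Porsche', '8 Series BMW'],
})

# First run of ASCII letters in each Model value.
first_word = df['Model'].str.extract('([a-zA-Z]+)', expand=False)

# Fill only the missing brands; existing values are left untouched.
filled = df['Brand'].fillna(first_word)
print(filled.tolist())
# ['Mazda', 'BMW', 'BMW', 'Ford', 'Porsche', 'Skoda', 'Porsche', 'Series']
```

Row 6 works because "911" contains no letters, but row 7 picks up "Series" instead of "BMW".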

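Using a known list of brands:

import re

brands = ['BMW', 'Ford', 'Porsche', 'Skoda', 'Mazda']
# Match any known brand anywhere in Model; re.escape guards against regex metacharacters in a name
regex = '(%s)' % '|'.join(map(re.escape, brands))
df['Brand'] = df['Brand'].fillna(df['Model'].str.extract(regex, expand=False))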
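The brand-list approach can be checked end to end with a self-contained sketch like the one below. The DataFrame is again reconstructed from the question. The question prints missing brands as "nan", which can mean the column holds the literal string 'nan' rather than a real NaN, so the sketch assumes that may happen and normalises it first. That step is an assumption about the data, not part of the original answer.

```python
import re

import numpy as np
import pandas as pd

brands = ['BMW', 'Ford', 'Porsche', 'Skoda', 'Mazda']

# Sample data reconstructed from the question. Row 0 uses the string 'nan'
# on purpose to illustrate the normalisation step (an assumption).
df = pd.DataFrame({
    'Brand': ['nan', 'BMW', 'BMW', 'Ford', 'Porsche', 'Skoda', np.nan, np.nan],
    'Model': ['Mazda CX-5', 'BMW X5', 'BMW 2 Series', 'Ford Focus',
              'Porsche Cayenne', 'Skoda Octavia', '911 Porsche', '8 Series BMW'],
})

# Turn literal 'nan' strings into real missing values so fillna can fill them.
df['Brand'] = df['Brand'].mask(df['Brand'].eq('nan'))

# One capture group matching any known brand, e.g. (BMW|Ford|Porsche|Skoda|Mazda).
regex = '(%s)' % '|'.join(re.escape(b) for b in brands)

# Extract the brand wherever it appears in Model and fill only the gaps.
df['Brand'] = df['Brand'].fillna(df['Model'].str.extract(regex, expand=False))
print(df)
```

Because the pattern is not anchored, it finds the brand at any position in Model, which covers both "911 Porsche" and "8 Series BMW". The match is case-sensitive, so a value such as "bmw X5" would stay unfilled unless you pass flags=re.IGNORECASE to str.extract and normalise the result.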