Pandas find multiple words from a list and assign Boolean value if found

I have a DataFrame like this:

import pandas as pd

data = {
    "properties": ["FinancialOffice", "Gas Station", "Office", "K-12 School", "Commercial, Office"],
}
df = pd.DataFrame(data)

This is my list:

proplist = ["Office","Other - Mall","Gym"]

Using the list, I want to find which rows of the DataFrame column contain a value that exactly matches one of the list entries, and assign each row a Boolean (True/False, or 0/1). It has to be an exact match.

The expected output looks like this:

properties         flag
FinancialOffice    FALSE
Gas Station        FALSE
Office             TRUE
K-12 School        FALSE
Commercial, Office TRUE

So it returns TRUE for "Office" because that is an exact match for an entry in the list. FinancialOffice is FALSE because it is not in the list. The last row, Commercial, Office, is TRUE because Office is in the list even though Commercial is not. If at least one of the comma-separated values is present, the flag should be TRUE.

df["flag"] = df["properties"].isin(proplist)

The code above assigns the Boolean correctly for the single-value rows, but it returns FALSE for the last one (Commercial, Office) because it compares the whole string against the list.

Any help is appreciated.

>Solution :

Build a regex from the list and wrap it in word boundaries (\b), so that each entry matches only as a whole word:

import re

regex = r'\b(?:%s)\b' % '|'.join(map(re.escape, proplist))
# '\\b(?:Office|Other\\ \\-\\ Mall|Gym)\\b'

df['flag'] = df['properties'].str.contains(regex, regex=True)
# for a case-insensitive match, add the case=False parameter
# if the column may contain missing values, add na=False so they come out as False

Output:

           properties   flag
0     FinancialOffice  False
1         Gas Station  False
2              Office   True
3         K-12 School  False
4  Commercial, Office   True
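
One caveat with the word-boundary approach: \b also treats a hyphen as a boundary, so a value such as "Post-Office" would be flagged too. If that matters, a possible alternative is to split each cell on commas and compare the trimmed pieces with the list. The sketch below assumes the comma is the only separator inside a cell; the helper name has_listed_item is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "properties": ["FinancialOffice", "Gas Station", "Office",
                   "K-12 School", "Commercial, Office"],
})
proplist = ["Office", "Other - Mall", "Gym"]
wanted = set(proplist)  # set membership keeps the per-item check cheap


def has_listed_item(cell: str) -> bool:
    """Return True if any comma-separated item in the cell is in the list."""
    # Assumption: items inside a cell are separated by commas only.
    items = {part.strip() for part in cell.split(",")}
    return not wanted.isdisjoint(items)


df["flag"] = df["properties"].map(has_listed_item)
print(df)
```

Because each piece is compared as a whole string, "FinancialOffice" and "Post-Office" both stay False, while "Commercial, Office" is True. The helper expects strings, so missing values would need to be filled or filtered first.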