Search for a list of words in all pandas columns

Below is my DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['NYC', 'NYC', 'Boston', 'LA', 'SF', 'NYC'], 'b': ['Other', 'Other', 'NY', 'NUI', 'SD', 'SF']})

        a      b
0     NYC  Other
1     NYC  Other
2  Boston     NY
3      LA    NUI
4      SF     SD
5     NYC     SF

The aim is to check whether any word from a list appears anywhere in the df.

Below is the code to check for a single word:

word = 'SF'
mask = np.column_stack([df[col].str.contains(word, na=False) for col in df])
df.loc[mask.any(axis=1)]


     a   b
4   SF  SD
5  NYC  SF

How can this be done with a list of words instead of a single string?

word = ['SF', 'NY']

>Solution :

If you want to match exact words, use isin combined with any:

word = ['SF', 'NY']

df[df.isin(word).any(axis=1)]

output:

        a   b
2  Boston  NY
4      SF  SD
5     NYC  SF

intermediates:

df.isin(word)

       a      b
0  False  False
1  False  False
2  False   True
3  False  False
4   True  False
5  False   True

df.isin(word).any(axis=1)

0    False
1    False
2     True
3    False
4     True
5     True
dtype: bool
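
The exact-match steps above can be wrapped in a small helper. This is only a sketch: the function name `rows_with_any` is hypothetical (it is not part of pandas), and it assumes the cells should equal one of the words exactly. The last line shows the same mask reduced along the other axis, which reports the columns that contain a match instead of the rows.

```python
import pandas as pd

def rows_with_any(frame, words):
    """Return the rows where at least one cell equals one of `words` exactly."""
    # isin gives a boolean frame; any(axis=1) collapses it to one flag per row
    mask = frame.isin(words).any(axis=1)
    return frame[mask]

df = pd.DataFrame({'a': ['NYC', 'NYC', 'Boston', 'LA', 'SF', 'NYC'],
                   'b': ['Other', 'Other', 'NY', 'NUI', 'SD', 'SF']})
word = ['SF', 'NY']

matches = rows_with_any(df, word)

# Reducing along axis=0 instead gives one flag per column
cols_hit = df.isin(word).any(axis=0)
```

Passing `axis` by keyword works across pandas versions, whereas the positional form `any(1)` is rejected by pandas 2.0 and later.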

For a regex (substring) match, combine apply and str.contains. Note that 'NY' also matches inside 'NYC', so rows 0 and 1 are returned as well:

word = ['SF', 'NY']
regex = '|'.join(word)
df[df.apply(lambda c: c.str.contains(regex)).any(axis=1)]

output:

        a      b
0     NYC  Other
1     NYC  Other
2  Boston     NY
4      SF     SD
5     NYC     SF
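
Because str.contains matches substrings, 'NY' also hits 'NYC' in the output above. If whole words inside longer strings are what you want, one possible variation (a sketch, not part of the original answer) is to escape each word with `re.escape` and wrap the alternation in `\b` word boundaries. The variable names `pattern` and `whole_word_rows` are illustrative, and `na=False` is carried over from the question's code so that missing values count as non-matches.

```python
import re
import pandas as pd

df = pd.DataFrame({'a': ['NYC', 'NYC', 'Boston', 'LA', 'SF', 'NYC'],
                   'b': ['Other', 'Other', 'NY', 'NUI', 'SD', 'SF']})
word = ['SF', 'NY']

# Escape regex metacharacters in each word, then require word boundaries on
# both sides; the non-capturing group (?:...) keeps the alternation together.
pattern = r'\b(?:' + '|'.join(re.escape(w) for w in word) + r')\b'

# Column-wise regex search, then keep rows with at least one matching cell
mask = df.apply(lambda c: c.str.contains(pattern, na=False)).any(axis=1)
whole_word_rows = df[mask]
```

On this data the result is the same as the isin version (rows 2, 4 and 5), but unlike isin it would also find a word inside a longer cell such as 'NY office'.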