Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas select rows where value is in either of two columns

I have a dataframe that looks like this

Title        Description
Area 51      Aliens come to earth on the 4th of July.
Matrix       Hacker Neo discovers the shocking truth.
Spaceballs   A star-pilot for hire and his trusty sidekick must come to the rescue of a princess.

I am want to select rows that contain the word Space or Aliens in either the title or description.

I can select rows that contain space using a single column but I am unsure of how to include the second column.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

   words_of_interest = ["Space", "Aliens"]
   df[df["Title"].str.contains("|".join(words_of_interest))]


   Title        Description
   Area 51      Aliens come to earth on the 4th of July.
   Spaceballs   A star-pilot for hire and his trusty sidekick must come to the rescue of a 

>Solution :

You can apply str.contains on both columns then aggregate boolean mask with any(axis=1):

words_of_interest = ["Space", "Aliens"]
pat = '|'.join(words_of_interest)
mask = df[['Title', 'Description']].apply(lambda x: x.str.contains(pat)).any(axis=1)

Output:

>>> df[mask]
        Title                                        Description
0     Area 51           Aliens come to earth on the 4th of July.
2  Spaceballs  A star-pilot for hire and his trusty sidekick ...
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading