I have a dataframe that looks like this:
Title       Description
Area 51     Aliens come to earth on the 4th of July.
Matrix      Hacker Neo discovers the shocking truth.
Spaceballs  A star-pilot for hire and his trusty sidekick must come to the rescue of a princess.
I want to select rows that contain the word Space or Aliens in either the title or the description.
I can select rows that contain these words using a single column, but I am unsure how to include the second column:
words_of_interest = ["Space", "Aliens"]
df[df["Title"].str.contains("|".join(words_of_interest))]
Title Description
Area 51 Aliens come to earth on the 4th of July.
Spaceballs A star-pilot for hire and his trusty sidekick must come to the rescue of a
>Solution :
You can apply str.contains to both columns, then combine the resulting boolean masks row-wise with any(axis=1), which is True for a row when at least one column matched:
words_of_interest = ["Space", "Aliens"]
pat = '|'.join(words_of_interest)
mask = df[['Title', 'Description']].apply(lambda x: x.str.contains(pat)).any(axis=1)
Output:
>>> df[mask]
        Title                                         Description
0     Area 51            Aliens come to earth on the 4th of July.
2  Spaceballs  A star-pilot for hire and his trusty sidekick ...
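If only two columns are involved, the same idea can also be written without apply by OR-ing the two masks directly. The snippet below is a minimal, self-contained sketch rather than the only way to do it. It rebuilds the sample dataframe from the question, and the re.escape, case=False and na=False parts are optional additions not present in the original answer (assumptions about what you might want: literal matching, case-insensitive search, and treating missing values as non-matches).

```python
import re

import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "Title": ["Area 51", "Matrix", "Spaceballs"],
    "Description": [
        "Aliens come to earth on the 4th of July.",
        "Hacker Neo discovers the shocking truth.",
        "A star-pilot for hire and his trusty sidekick must come to the rescue of a princess.",
    ],
})

words_of_interest = ["Space", "Aliens"]

# re.escape keeps words with regex metacharacters (e.g. "C++") literal;
# this is an optional safeguard, not part of the original answer
pat = "|".join(re.escape(word) for word in words_of_interest)

# One boolean Series per column, combined with | (element-wise OR).
# case=False ignores letter case; na=False counts missing values as "no match".
mask = (
    df["Title"].str.contains(pat, case=False, na=False)
    | df["Description"].str.contains(pat, case=False, na=False)
)

result = df[mask]
print(result)
```

For more than a couple of columns, the apply(...).any(axis=1) form above scales better, since you only need to extend the column list.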