Extracting the strings that contain a given substring from an array of strings (Python)

A question in Python (3.9.5) and Pandas:

Suppose I have an array of strings x and I want to extract all the elements that contain a certain substring, e.g. feb05. Is there a Pythonic way to do it in one line, possibly using a Pandas function?

An example of what I mean:

x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"
desired_output = ["2023_feb05", "2024_feb05"]

I can do it with a loop:

import numpy as np
import pandas as pd

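desired_output = []
indices_bool = np.zeros(len(x), dtype=bool)  # boolean mask, one entry per string
for idx, test in enumerate(x):
    if must_contain in test:
        indices_bool[idx] = True
        desired_output.append(test)  # collect the matching string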

but I am looking for a more Pythonic way to do it.

In my application x is a column in a Pandas dataframe, so answers using Pandas functions are also welcome. The goal is to keep only the rows that have must_contain in the field x (e.g. x = df["names"]).

>Solution :

Since you are using pandas, you can use str.contains to get a boolean mask:

import pandas as pd
df = pd.DataFrame({'x': ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]})
must_contain = "feb05"

df.x.str.contains(must_contain)
#0    False
#1    False
#2    False
#3     True
#4     True
#Name: x, dtype: bool
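
One thing to keep in mind: str.contains treats the pattern as a regular expression by default. If the substring should be matched literally, or the column may hold missing values, something like the following may be safer. This is only a sketch; the names Series below is made-up data, not from the question.

```python
import pandas as pd

# Hypothetical data: one missing value, and names where "." matters.
names = pd.Series(["2023_feb05", None, "2023_feb.05", "2023_febX05"])

# regex=False matches the substring literally ("." is a plain dot here);
# na=False maps missing values to False instead of leaving them as NaN.
mask = names.str.contains("feb.05", regex=False, na=False)

# Keep only the entries where the mask is True.
matches = names[mask].tolist()
```

With the default regex=True, the "." would match any character, so "2023_febX05" would be selected as well.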

Then filter the rows by that condition:

df[df.x.str.contains(must_contain)]
#            x
#3  2023_feb05
#4  2024_feb05
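
If x is a plain Python list rather than a DataFrame column, a list comprehension is probably the simplest one-liner. A minimal sketch using the data from the question:

```python
x = ["2023_jan05", "2023_jan_27", "2023_feb04", "2023_feb05", "2024_feb05"]
must_contain = "feb05"

# Keep every string that has must_contain somewhere inside it.
desired_output = [s for s in x if must_contain in s]

# The matching boolean mask, if the positions are needed as well.
indices_bool = [must_contain in s for s in x]
```

This needs neither numpy nor pandas, and the in operator always does a literal substring test.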
