I’d like to filter my dataframe so that only rows whose value in a given column is a substring of another string are selected. I know that the opposite, checking whether a column's values contain a given substring, can be done like this:
selection = df[df.b.str.contains(substring)]
But how would I do it when the substrings are in the dataframe and each one is compared against another, longer string? Here is what I’ve tried:
import pandas
a = pandas.DataFrame({"b":["foo","bar"]})
selection = a[a.b.str in "foot"] # should match first row
selection = a[a.b.str.isin("foot")] # should match first row
selection = a[a.b.str.isin("foobar")] # should match both rows
but none of these work.
>Solution :
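You can use apply on the column with a lambda that checks whether each value is a substring of the target string, then index the dataframe with the resulting boolean mask.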
import pandas as pd
a = pd.DataFrame({"b":["foo","bar"]})
# x in "foot" is True when the column value x occurs inside "foot"
selection = a[a.b.apply(lambda x: x in "foot")] # matches the first row
selection = a[a.b.apply(lambda x: x in "foobar")] # matches both rows
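If the column might hold missing or non-string values, the bare lambda raises a TypeError, because something like None in "foot" is not a valid test. Below is a minimal sketch of a guarded version. The helper name rows_contained_in and the extra None row are made up for illustration and are not part of the original question.

```python
import pandas as pd

def rows_contained_in(df, column, target):
    """Return the rows of df whose string value in `column` is a substring of `target`.

    Hypothetical helper for illustration; non-string values (None, NaN) never match.
    """
    mask = df[column].map(lambda value: isinstance(value, str) and value in target)
    return df[mask]

# Same data as above, plus a missing value to show the guard
a = pd.DataFrame({"b": ["foo", "bar", None]})

first = rows_contained_in(a, "b", "foot")    # keeps only the "foo" row
both = rows_contained_in(a, "b", "foobar")   # keeps the "foo" and "bar" rows
```

Note that an empty string counts as a substring of every string, so a row holding "" would always be selected by either version.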