I am using str.contains in my dataframe to see if a certain value is inside the values of a Series.
Instead of the output being True or False, I want to see the actual value that I pass inside the contains.
A B
1 Fer
2 Ger
3 Tir
My expected output:
A B C
1 Fer er
2 Ger er
3 Tir NaN
Is there a built-in way to do this with pandas?
>Solution :
Series.str.extract is perfect for this:
df['C'] = df['B'].str.extract('(er)')
Output:
>>> df
A B C
0 1 Fer er
1 2 Ger er
2 3 Tir NaN
The parentheses in (er) are important: they define a capture group. For each row, if the regular expression matches, the text captured by the group (taken from the first match in that value) is copied into the output column. If the regular expression doesn't match, the output is NaN. By default (expand=True), .str.extract returns a DataFrame with one column per capture group, so (er)(abc)(def) would return a DataFrame with 3 columns.
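For reference, here is a minimal runnable sketch of the above. It assumes the sample frame from the question; the variable name parts and the second pattern ([A-Z])(er) are made up here purely to illustrate the multi-group case. Passing expand=False is optional: with a single capture group it makes .str.extract return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"A": [1, 2, 3], "B": ["Fer", "Ger", "Tir"]})

# One capture group: rows that match get "er", rows that don't get NaN.
# expand=False returns a Series instead of a one-column DataFrame.
df["C"] = df["B"].str.extract(r"(er)", expand=False)

# Two capture groups (illustrative pattern): the result is a DataFrame
# with one column per group, labelled 0 and 1.
parts = df["B"].str.extract(r"([A-Z])(er)")

print(df)
print(parts)
```

If the text you are searching for comes from a variable and may contain regex metacharacters (such as . or +), wrapping it in re.escape before building the pattern keeps it from being interpreted as a regular expression.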