.str.contains returning actual found value instead of True or False

I am using str.contains on a DataFrame column to check whether a certain value occurs in the values of a Series.

Instead of getting True or False, I want to see the actual value that I pass to str.contains.

A     B
1   Fer
2   Ger
3   Tir    

My expected output:

A     B    C
1   Fer   er
2   Ger   er
3   Tir  NaN

Is there a built-in way to do this with pandas?

Solution:

Series.str.extract is perfect for this:

df['C'] = df['B'].str.extract('(er)')

Output:

>>> df
   A    B    C
0  1  Fer   er
1  2  Ger   er
2  3  Tir  NaN

The parentheses in (er) are important: they define a capture group. For each row, the text matched by the first occurrence of the pattern is copied into the output column; if the pattern doesn’t match, the row gets NaN. By default, .str.extract returns a DataFrame with one column per capture group, so (er)(abc)(def) would return a DataFrame with 3 columns.
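To make the capture-group behaviour concrete, here is a minimal sketch using the sample data from the question. The pattern ([A-Z])(er), the extra column D, and the variable needle are illustrative assumptions, not part of the original answer. Passing expand=False is optional; with a single capture group it makes .str.extract return a Series instead of a one-column DataFrame.

```python
import re

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": ["Fer", "Ger", "Tir"]})

# One capture group with expand=False returns a Series, which can be
# assigned directly to a new column.
df["C"] = df["B"].str.extract(r"(er)", expand=False)

# Two capture groups return a DataFrame with one column per group
# (the columns are labelled 0 and 1 because the groups are unnamed).
parts = df["B"].str.extract(r"([A-Z])(er)")

# If the search value is arbitrary text rather than a regex, escape it
# first so characters such as "." or "+" are matched literally.
needle = "er"  # hypothetical search value
df["D"] = df["B"].str.extract(f"({re.escape(needle)})", expand=False)

print(df)
print(parts)
```

Rows where the whole pattern fails to match, such as Tir here, get NaN in every extracted column.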
