How to return the matched keywords when using pandas str.contains with the regex parameter?

This is my sample code:

import pandas as pd

df = pd.DataFrame({'A':
                       ['btcrr',
                        'You have crypto here',
                        'coinbase.com was there ',
                        'hotwalletint']
                   })

regex = r"(^|\W)(?:btc|crypto|coinbase|hotwallet)[^A-Za-z0-9]"
tagged_df = df[df['A'].str.contains(regex, na=False, regex=True, case=False)]

The output of tagged_df:

   A
1  You have crypto here
2  coinbase.com was there 

This keeps only the rows that match the regex I gave. What I want instead is for pandas to return the matched keyword itself, so that tagged_df looks like the expected output below.

The Expected output of tagged_df:

   A
1  crypto
2  coinbase.com

If pandas cannot do this, please suggest alternatives that solve this case.

>Solution :

Use pandas.Series.str.extract(). For each capture group in the regular expression, a new column is created containing the value matched by that group in that row. (A non-capturing group is a group that starts with ?:, e.g. (?:abc); it produces no column, which is why the leading (^|\W) becomes (?:^|\W) here.) You can also add ?P<your_name> at the very beginning of a capture group to name the output column associated with that group:

new_df = df['A'].str.extract(r'(?:^|\W)(?P<A>btc|crypto|coinbase|hotwallet)[^A-Za-z0-9]')

Output:

>>> new_df
          A
0       NaN
1    crypto
2  coinbase
3       NaN

>>> new_df.dropna()
          A
1    crypto
2  coinbase
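
The call above is case-sensitive, whereas the str.contains call in the question used case=False. A minimal sketch of one way to keep that behaviour is to pass flags=re.IGNORECASE to str.extract. The column name keyword and the extra row 'Send BTC now' are assumptions added purely for illustration; they are not part of the original data.

```python
import re

import pandas as pd

df = pd.DataFrame({'A': ['btcrr',
                         'You have crypto here',
                         'coinbase.com was there ',
                         'hotwalletint',
                         'Send BTC now']})  # hypothetical extra row, not in the question

# Same pattern as in the answer; the named group becomes the output column.
# The column name "keyword" is an illustrative choice.
pattern = r'(?:^|\W)(?P<keyword>btc|crypto|coinbase|hotwallet)[^A-Za-z0-9]'

# flags=re.IGNORECASE plays the role of case=False in str.contains.
matches = df['A'].str.extract(pattern, flags=re.IGNORECASE)

# Rows without a match hold NaN; drop them to keep only the tagged rows.
tagged_df = matches.dropna()
print(tagged_df)
```

The extracted text keeps the casing found in the row, so the hypothetical row yields BTC rather than btc. If a Series is more convenient than a one-column DataFrame, passing expand=False to str.extract should return one when the pattern has a single capture group.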