Searching for strings in lists inside Pandas DataFrame

I’m trying to search for strings inside lists that are stored in a column of a pandas DataFrame. For example:

       userAuthor     hashtagsMessage
post_1    nytimes            [#Emmys]
post_2        TMZ                  []
post_3     Forbes        [#BTSatUNGA]
post_4    nytimes            [#Emmys]
post_5     Forbes  [#BTS, #BTSatUNGA]

As you can see, the column that holds the lists is ‘hashtagsMessage’. I’ve tried the usual string-search methods, but none of them worked.

If I wanted an exact match for ‘#BTS’ in a column of plain strings, I could use something like:

df['hashtagsMessage'].str.contains("#BTS", case=False)

or

df['hashtagsMessage']=="#BTS" 

Or something similar. Unfortunately, these approaches do not work for lists. I suppose I need an extra step to look inside each list while searching the DataFrame, but I’m not sure how to do that.

Any help is greatly appreciated!

Solution:

Use map or apply to test each list for membership:

>>> df['hashtagsMessage'].map(lambda x: '#BTS' in x)

post_1    False
post_2    False
post_3    False
post_4    False
post_5     True
Name: hashtagsMessage, dtype: bool
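
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the example frame and applies the same membership test. The helper name has_tag and its ignore_case flag are hypothetical, added only for illustration. The case-insensitive variant is an assumption about what the case=False in the question was meant to achieve.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "userAuthor": ["nytimes", "TMZ", "Forbes", "nytimes", "Forbes"],
        "hashtagsMessage": [
            ["#Emmys"],
            [],
            ["#BTSatUNGA"],
            ["#Emmys"],
            ["#BTS", "#BTSatUNGA"],
        ],
    },
    index=["post_1", "post_2", "post_3", "post_4", "post_5"],
)


def has_tag(tags, target, ignore_case=False):
    """Hypothetical helper: True if `target` is an element of the list `tags`."""
    if ignore_case:
        return target.lower() in (t.lower() for t in tags)
    # `in` on a list compares whole elements, so '#BTS' does not match '#BTSatUNGA'.
    return target in tags


# Exact match, same result as the lambda in the answer.
mask = df["hashtagsMessage"].map(lambda tags: has_tag(tags, "#BTS"))

# Use the boolean mask to select the matching rows.
matches = df[mask]

# Case-insensitive variant (assumed intent of case=False in the question).
mask_ci = df["hashtagsMessage"].map(
    lambda tags: has_tag(tags, "#bts", ignore_case=True)
)
```

Because the test is list membership and not substring search, post_3 is not selected even though ‘#BTSatUNGA’ starts with ‘#BTS’.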

Update

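A more vectorized approach uses explode, which gives every list element its own row while keeping the original index label, so the matching rows can be looked up by index: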

>>> df.loc[df['hashtagsMessage'].explode().eq('#BTS').loc[lambda x: x].index]

       userAuthor     hashtagsMessage
post_5     Forbes  [#BTS, #BTSatUNGA]
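
A possible variation on the explode approach, sketched here under the assumption that the index labels are unique: collapsing the exploded matches back to one boolean per post with groupby(level=0).any() gives a mask aligned with the original frame, so a post that lists the same hashtag twice is not returned twice. The variable names and the wanted set are illustrative and not part of the original answer.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(
    {
        "userAuthor": ["nytimes", "TMZ", "Forbes", "nytimes", "Forbes"],
        "hashtagsMessage": [
            ["#Emmys"],
            [],
            ["#BTSatUNGA"],
            ["#Emmys"],
            ["#BTS", "#BTSatUNGA"],
        ],
    },
    index=["post_1", "post_2", "post_3", "post_4", "post_5"],
)

# One row per hashtag; empty lists become NaN, which never equals a tag.
exploded = df["hashtagsMessage"].explode()

# Collapse back to one boolean per post: True if any of its tags matched.
mask = exploded.eq("#BTS").groupby(level=0, sort=False).any()
result = df[mask]

# Matching any of several tags works the same way with isin.
wanted = {"#BTS", "#Emmys"}  # illustrative set of tags
mask_any = exploded.isin(wanted).groupby(level=0, sort=False).any()
result_any = df[mask_any]
```

The mask can also be combined with other conditions, for example df[mask & (df["userAuthor"] == "Forbes")].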