Assigning True/False if a token is present in a data-frame

My current data-frame is:

     |articleID | keywords                                               | 
     |:-------- |:------------------------------------------------------:| 
0    |58b61d1d  | ['Second Avenue (Manhattan, NY)']                      |     
1    |58b6393b  | ['Crossword Puzzles']                                  |          
2    |58b6556e  | ['Workplace Hazards and Violations', 'Trump, Donald J']|            
3    |58b657fa  | ['Trump, Donald J', 'Speeches and Statements']         |  

I want a data-frame like the following, with an added column that is True when the token 'Trump, Donald J' appears in the keywords and False otherwise:

     |articleID | keywords                                               | trumpMention |
     |:-------- |:------------------------------------------------------:| ------------:|
0    |58b61d1d  | ['Second Avenue (Manhattan, NY)']                      | False        |      
1    |58b6393b  | ['Crossword Puzzles']                                  | False        |          
2    |58b6556e  | ['Workplace Hazards and Violations', 'Trump, Donald J']| True         |           
3    |58b657fa  | ['Trump, Donald J', 'Speeches and Statements']         | True         |       

I have tried several approaches using DataFrame functions but cannot get the result I want. Some of my attempts:

df['trumpMention'] = np.where(any(df['keywords']) == 'Trump, Donald J', True, False) 

or

df['trumpMention'] = df['keywords'].apply(lambda x: any(token == 'Trump, Donald J') for token in x) 

or

lst = ['Trump, Donald J']  
df['trumpMention'] = df['keywords'].apply(lambda x: ([ True for token in x if any(token in lst)]))   

Raw input:

import pandas as pd

df = pd.DataFrame({'articleID': ['58b61d1d', '58b6393b', '58b6556e', '58b657fa'],
                   'keywords': [['Second Avenue (Manhattan, NY)'],
                                ['Crossword Puzzles'],
                                ['Workplace Hazards and Violations', 'Trump, Donald J'],
                                ['Trump, Donald J', 'Speeches and Statements']],
                   'trumpMention': [False, False, True, True]})

Solution:

Each entry in keywords is a Python list, so a plain membership test inside apply is enough. Try:

df["trumpMention"] = df["keywords"].apply(lambda x: "Trump, Donald J" in x)
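
For reference, here is a minimal sketch that runs the line above on the sample data from the question. It also shows one way the second attempt could be repaired: there the generator expression ended up outside the lambda, so `any()` was handed a single boolean rather than an iterable. The names `TOKEN`, `via_any`, `wanted`, and `via_set` are made up for illustration, and the set-based variant is only a possible extension for checking several tokens at once, not something the question asked for.

```python
import pandas as pd

df = pd.DataFrame({
    "articleID": ["58b61d1d", "58b6393b", "58b6556e", "58b657fa"],
    "keywords": [
        ["Second Avenue (Manhattan, NY)"],
        ["Crossword Puzzles"],
        ["Workplace Hazards and Violations", "Trump, Donald J"],
        ["Trump, Donald J", "Speeches and Statements"],
    ],
})

TOKEN = "Trump, Donald J"  # illustrative constant, not from the original post

# Membership test per row: `in` on a list checks for an exact element match.
df["trumpMention"] = df["keywords"].apply(lambda kws: TOKEN in kws)

# Repaired form of the second attempt: the generator sits inside any(),
# and any() sits inside the lambda.
via_any = df["keywords"].apply(lambda kws: any(tok == TOKEN for tok in kws))

# Hypothetical extension: flag rows that mention at least one token from a set.
wanted = {"Trump, Donald J"}
via_set = df["keywords"].apply(lambda kws: not wanted.isdisjoint(kws))

print(df["trumpMention"].tolist())  # [False, False, True, True]
```

All three variants should give the same column for this data. The plain `in` test is the shortest, so it is probably the one to prefer unless you need to match several tokens.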