Identify instances where string exists more than once in a row + Python, Pandas, Dataframe

I want to write a script that will identify instances where a word (string) appears more than once in a row of a pandas dataframe.

Using a lambda function I can identify the existence of a string in a row, but I can’t find any information on how to identify ‘2 or more’ instances of the string. This is an example of what I have currently:

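import pandas as pd
df = pd.DataFrame({'ID':[1,2,3],'Ans1':['Yes','Yes','Yes'],'Ans2':['No','Yes','No'],'Ans3':['No','No','No']})
# True if at least one cell in the row contains 'Yes'; this does not count occurrences
df['Result'] = df.apply(lambda row: row.astype(str).str.contains('Yes').any(), axis=1)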

df

Pseudocode for what I’m trying to get:

if count of 'Yes' in row > 1:
   df['Result'] = True

Desired result:

ID  Ans1    Ans2    Ans3    Result
1   Yes     No      No      False
2   Yes     Yes     No      True
3   Yes     No      No      False

>Solution :

Try this. Use eq ("equals") to compare every cell with 'Yes', sum with axis=1 to count the matches along each row, then use gt ("greater than") to check whether that count is greater than 1. If you don’t want to check the entire dataframe for 'Yes', filter down to the relevant columns first.

df['Result'] = df.eq('Yes').sum(axis=1).gt(1)

Output:

   ID Ans1 Ans2 Ans3  Result
0   1  Yes   No   No   False
1   2  Yes  Yes   No    True
2   3  Yes   No   No   False
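
The column filtering mentioned in the answer is not shown there, so here is a minimal sketch of one way it might look. It assumes that only the Ans1, Ans2 and Ans3 columns should be searched; the names answer_cols and yes_count are hypothetical and introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Ans1': ['Yes', 'Yes', 'Yes'],
    'Ans2': ['No', 'Yes', 'No'],
    'Ans3': ['No', 'No', 'No'],
})

# Assumption: only the answer columns should be searched, not ID.
answer_cols = ['Ans1', 'Ans2', 'Ans3']

# Count the cells equal to 'Yes' in each row of the selected columns.
yes_count = df[answer_cols].eq('Yes').sum(axis=1)

# Flag the rows where 'Yes' appears more than once.
df['Result'] = yes_count.gt(1)

print(df)
```

Note that eq checks for an exact match on the whole cell, whereas the str.contains call in the question also matches substrings, so the two can differ on cells such as 'Yes, maybe'.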