Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

select pandas rows based on 2 conditions using or

I have a pandas dataframe filled with values between 0 and 3 and now I would like to either select the max value and return the column label or return ‘undecided’ if a row does only contain 0 and 1 across multiple columns.

so far I have the first part sorted:

df['result'] = df.idxmax(axis=1) or 

How do I write the second part?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

My dataframe looks like this and I want to add a column ‘result’ to it:

Column0 | Column1 | Column2 | Column3 | Column4 | result
   0         0         1         2        0       Column3
   3         1         0         0        0       Column0 
   0         1         0         1        0       undecided

>Solution :

Use mask to hide rows that do not respect the constraint:

df['result'] = df.mask(df.lt(2).all(1)).idxmax(1).fillna('undecided')
print(df)

# Output
   Column0  Column1  Column2  Column3  Column4     result
0        0        0        1        2        0    Column3
1        3        1        0        0        0    Column0
2        0        1        0        1        0  undecided

Output of mask:

>>> df.mask(df.lt(2).all(1))
   Column0  Column1  Column2  Column3  Column4
0      0.0      0.0      1.0      2.0      0.0
1      3.0      1.0      0.0      0.0      0.0
2      NaN      NaN      NaN      NaN      NaN
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading