Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas df filter on rows with any n rows satisfying condition

I have a df with many columns and I want to select all rows that have any 2 column-values nonzero. filtering on rows where specific columns are nonzero is trivial, I mean this is probably trivial as well but it’s eluding me. have/want dataframes for reproduceable example below:

import pandas as pd

have = pd.DataFrame({'ID': [1,2,3,4,5],
        'grp1': [1,0,0,0,0],
        'grp2': [0,2,0,0,0],
        'grp3': [1,1,0,0,5]})

want = have.iloc[0:2]
>>> want
   ID  grp1  grp2  grp3
0   1     1     0     1
1   2     0     2     1

last note – I pass back and forth between pandas and sql(ite) a lot, so if there’s a more elegant way to do this in sql than pandas that I’m missing, I’m open to that too.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

In pandas:

have[have.iloc[:,1:].gt(0).sum(1).gt(1)]
 
   ID  grp1  grp2  grp3
0   1     1     0     1
1   2     0     2     1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading