I have a df with many columns and I want to select all rows that have any 2 column-values nonzero. filtering on rows where specific columns are nonzero is trivial, I mean this is probably trivial as well but it’s eluding me. have/want dataframes for reproduceable example below:
import pandas as pd
have = pd.DataFrame({'ID': [1,2,3,4,5],
'grp1': [1,0,0,0,0],
'grp2': [0,2,0,0,0],
'grp3': [1,1,0,0,5]})
want = have.iloc[0:2]
>>> want
ID grp1 grp2 grp3
0 1 1 0 1
1 2 0 2 1
last note – I pass back and forth between pandas and sql(ite) a lot, so if there’s a more elegant way to do this in sql than pandas that I’m missing, I’m open to that too.
>Solution :
In pandas:
have[have.iloc[:,1:].gt(0).sum(1).gt(1)]
ID grp1 grp2 grp3
0 1 1 0 1
1 2 0 2 1