I have a dataframe that looks like this:
| Items | notebook | ballpoint | pencil | eraser | pencil sharpener | stapler | paper | scissors | glue |
|---|---|---|---|---|---|---|---|---|---|
| image1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| image2 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| image3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| image4 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| image5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
I want to delete rows that have multiple 1 in different columns, so it become like this:
| Items | notebook | ballpoint | pencil | eraser | pencil sharpener | stapler | paper | scissors | glue |
|---|---|---|---|---|---|---|---|---|---|
| image3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| image4 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| image5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
>Solution :
You can use boolean indexing with the sum of matches or values (if only 0/1) as reference:
out = df[df.drop(columns='Items').sum(axis=1).lt(2)]
Or:
out = df[df.eq(1).sum(axis=1).lt(2)]
Output:
Items notebook ballpoint pencil eraser pencil.1 sharpener stapler paper scissors glue
2 image3 0 0 0 0 1 0 0 0 0 NaN
3 image4 0 0 0 0 0 1 0 0 0 NaN
4 image5 0 0 0 0 0 0 0 1 0 NaN
Intermediate indexing Series:
df.drop(columns='Items').sum(axis=1).lt(2)
# or
# df.eq(1).sum(axis=1).lt(2)
0 False
1 False
2 True
3 True
4 True
dtype: bool