Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python Pandas difference in boolean indexing between ~ != and ==

I am confused about different results of boolean indexing when using ~ after != versus when using just ==

I have a pandas df with 4 columns:

dic = {
    "a": [1,1,1,0,0,1,1],
    "b": [0,0,1,1,0,0,0],
    "c": [1,0,1,0,0,1,0],
    "d": [0,0,1,0,0,1,0],
}

df = pd.DataFrame(data=dic)

print(df)

   a  b  c  d
0  1  0  1  0
1  1  0  0  0
2  1  1  1  1
3  0  1  0  0
4  0  0  0  0
5  1  0  1  1
6  1  0  0  0

I want to subset the whole df dataframe:
I want to remove all rows which have all elements zero, but just on the columns b c d, and not on a.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

If I use ~ (not) operator after == I get the desired result:

names = ["b","c","d"]

df_A = df.loc[~(df[names] == 0.0).all(axis=1)]

print(df_A)

   a  b  c  d
0  1  0  1  0
2  1  1  1  1
3  0  1  0  0
5  1  0  1  1

But when I use just == I get different result:

names = ["b","c","d"]

df_B = df.loc[(df[names] != 0.0).all(axis=1)]

print(df_B)

   a  b  c  d
2  1  1  1  1

Do you have any idea why is this the case? Should these two not be the same?

Thank you.

>Solution :

You’re not correctly following De Morgan’s law.

  • not (A or B) = (not A) and (not B)
  • not (A and B) = (not A) or (not B)

If you use the opposite condition as input, you have to replace all (AND) by any (OR):

df_B = df.loc[(df[names] != 0.0).any(axis=1)]

In English this would be:

  • I want to REMOVE (~) rows for which ALL values are 0 (== 0)
  • I want to KEEP rows for which ANY value is NOT 0 (!= 0)

In short the rules for inverting input conditions are:

  • swapping strict/equal operators (e.g. >= becomes < ; > becomes <=)
  • AND become OR and OR becomes AND
  • adding/removing NOT (~)

Output:

   a  b  c  d
0  1  0  1  0
2  1  1  1  1
3  0  1  0  0
5  1  0  1  1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading