Home Python Pandas difference in boolean indexing between ~ != and ==

Questions

Python Pandas difference in boolean indexing between ~ != and ==

July 26, 2024

I am confused about different results of boolean indexing when using ~ after != versus when using just ==

I have a pandas df with 4 columns:

dic = {
    "a": [1,1,1,0,0,1,1],
    "b": [0,0,1,1,0,0,0],
    "c": [1,0,1,0,0,1,0],
    "d": [0,0,1,0,0,1,0],
}

df = pd.DataFrame(data=dic)

print(df)

   a  b  c  d
0  1  0  1  0
1  1  0  0  0
2  1  1  1  1
3  0  1  0  0
4  0  0  0  0
5  1  0  1  1
6  1  0  0  0

I want to subset the whole df dataframe:
I want to remove all rows which have all elements zero, but just on the columns b c d, and not on a.

If I use ~ (not) operator after == I get the desired result:

names = ["b","c","d"]

df_A = df.loc[~(df[names] == 0.0).all(axis=1)]

print(df_A)

   a  b  c  d
0  1  0  1  0
2  1  1  1  1
3  0  1  0  0
5  1  0  1  1

But when I use just == I get different result:

names = ["b","c","d"]

df_B = df.loc[(df[names] != 0.0).all(axis=1)]

print(df_B)

   a  b  c  d
2  1  1  1  1

Do you have any idea why is this the case? Should these two not be the same?

Thank you.

>Solution :

You’re not correctly following De Morgan’s law.

not (A or B) = (not A) and (not B)
not (A and B) = (not A) or (not B)

If you use the opposite condition as input, you have to replace all (AND) by any (OR):

df_B = df.loc[(df[names] != 0.0).any(axis=1)]

In English this would be:

I want to REMOVE (~) rows for which ALL values are 0 (== 0)
I want to KEEP rows for which ANY value is NOT 0 (!= 0)

In short the rules for inverting input conditions are:

swapping strict/equal operators (e.g. >= becomes < ; > becomes <=)
AND become OR and OR becomes AND
adding/removing NOT (~)

Output:

   a  b  c  d
0  1  0  1  0
2  1  1  1  1
3  0  1  0  0
5  1  0  1  1

boolean

byMR

Published July 26, 2024

Add a comment

How to return number ranking based on criteria?

byMR

July 26, 2024

Questions

How can I convert a float value to datetime with higher precision in polars?

byMR

July 26, 2024

Questions

How to join edges to same point on node?

byMR

July 26, 2024

Questions

Why does this link list not work in HTML?

byMR

July 26, 2024

Questions

TypeError: the first argument must be callable – self.job_func = functools.partial(job_func, *args, **kwargs)

byMR

July 26, 2024

Python Pandas difference in boolean indexing between ~ != and ==

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

How to return number ranking based on criteria?

How can I convert a float value to datetime with higher precision in polars?

How to join edges to same point on node?

Why does this link list not work in HTML?

TypeError: the first argument must be callable – self.job_func = functools.partial(job_func, *args, **kwargs)

Keep Up to Date with the Most Important News

Python Pandas difference in boolean indexing between ~ != and ==

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

How to return number ranking based on criteria?

How to call a number from a multi-dimensional complex list

How can I convert a float value to datetime with higher precision in polars?

How to join edges to same point on node?

Why does this link list not work in HTML?

TypeError: the first argument must be callable – self.job_func = functools.partial(job_func, *args, **kwargs)

Discover more from Dev solutions