I’m attempting to filter values out of a pandas DataFrame, but Googling and asking ChatGPT have gotten me nowhere.
Why does
x1 = df[df!=True]
x2 = df[df==True]
result in two DataFrames, each with the same shape as the original? How can I split this DataFrame into the parts that are True and those that are not?
Ultimately, I want to do this filtering on a DataFrame with several columns, so what I really want is something more like:
x1 = df[df['col1']!=True]
x2 = df[df['col1']==True]
>Solution :
Take an example like this:
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
>>> df
   A  B
0  1  4
1  2  5
2  3  6
# df!=True marks the values which are NOT equal to True. A0 comes out False because 1 is equivalent to True (and 0 is equivalent to False) when using "==", so the 1 counts as equal to True.
>>> df!=True
       A     B
0  False  True
1   True  True
2   True  True
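The point about 1 matching True can be checked directly. This is a minimal sketch on the same example df; the name equal_to_true is only illustrative.

```python
import pandas as pd

# In Python, bool is a subclass of int, so True compares equal to 1
print(True == 1, False == 0, isinstance(True, int))

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Element-wise comparison: only the cell holding the integer 1 matches
equal_to_true = df == True
print(equal_to_true)
```

So on a DataFrame of integers, comparing against True is really a test for the value 1, which is probably not what you intended.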
# df[df!=True] keeps the values where df!=True is True and puts NaN everywhere else (hence A0 being NaN). Nothing is dropped, so the result has the same shape as df, and column A becomes float to hold the NaN.
>>> df[df!=True]
     A  B
0  NaN  4
1  2.0  5
2  3.0  6
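To see why the shape is preserved, here is a small sketch on the same example df. Indexing with a boolean DataFrame behaves like df.where(mask), which masks cell by cell. The names mask, masked and rows_all_pass are illustrative, and reducing the mask with .all(axis=1) is just one possible way to turn it into a row filter, assuming you want rows where every cell passes.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mask = df != True  # boolean DataFrame with the same shape as df

# Indexing with a boolean DataFrame masks individual cells, like df.where(mask)
masked = df[mask]
same_as_where = masked.equals(df.where(mask))

# To actually drop rows, reduce the mask to one boolean per row first
# (assumption: a row is kept only if every cell in it passed the test)
rows_all_pass = df[mask.all(axis=1)]

print(masked.shape, df.shape)
print(rows_all_pass)
```

If you would rather keep a row when any cell passes, .any(axis=1) is the matching reduction.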
# this is the same comparison as the first example, but applied only to column 'A', so the result is a boolean Series with one value per row
>>> df['A']!=True
0    False
1     True
2     True
Name: A, dtype: bool
# pandas interprets a boolean Series as a row filter, so it takes the values from the example above and keeps only the rows where the value is True
>>> df[df['A']!=True]
   A  B
1  2  5
2  3  6
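Applied to the question's col1 case, splitting by rows could look like the sketch below. The data is made up for illustration (col1 is assumed to hold real booleans, col2 is a hypothetical extra column), and mask is just a convenient name.

```python
import pandas as pd

# Hypothetical data: 'col1' is the boolean column named in the question
df = pd.DataFrame({'col1': [True, False, True], 'col2': ['a', 'b', 'c']})

mask = df['col1'] == True  # boolean Series, one value per row

x2 = df[mask]   # rows where col1 is True
x1 = df[~mask]  # all remaining rows

print(x1)
print(x2)
```

Because the mask is a Series rather than a DataFrame, each result contains only the matching rows, and together the two parts cover every row of df exactly once.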
Do you want to filter by column, by row, or by something else?