Home How can I filter a Pandas DataFrame based on whether all aggregated values in a column are True?

Questions

How can I filter a Pandas DataFrame based on whether all aggregated values in a column are True?

December 14, 2022

I have the following data

data = [
    [1, True],
    [1, True],
    [1, True],
    [1, True],
    [2, True],
    [2, False],
    [2, True],
    [3, True],
    [3, True],
    [3, True],
    [3, True],
    [4, True],
    [4, True],
    [4, False],
    [5, True],
    [5, True],
    [5, True],    
    [5, True],
]

df = pd.DataFrame(data, columns=['ids', 'accept'])

And I would like to filter out all rows whose IDs that have at least one False value in the accept column. So my result should look like (note the missing 2 and 4 IDs):

   ids accept
0   1   True
1   1   True
2   1   True
3   1   True
4   3   True
5   3   True
6   3   True
7   3   True
8   5   True
9   5   True
10  5   True
11  5   True

I was able to get a list of the IDs for which all values in the accept column are True using the groupby() and all() methods:

# Use the groupby() method to group the DataFrame by the 'ids' column
grouped = df.groupby('ids')

# Use the all() method to check whether all values in the 'accept' column are True for each group
accept_all_true = grouped['accept'].all()

But I am stuck at this point. How can I apply this grouping to my original data frame?

>Solution :

If performance is important you need avoid DataFrameGroupBy.filter because bad performance in larger DataFrame:

First idea is filter ids for at least one False and then fiter ouput this ids:

df1 = df[~df['ids'].isin(df.loc[~df['accept'],'ids'])]
print (df1)
    ids  accept
0     1    True
1     1    True
2     1    True
3     1    True
7     3    True
8     3    True
9     3    True
10    3    True
14    5    True
15    5    True
16    5    True
17    5    True

Or you can use GroupBy.transform for test if all Trues per groups, performance should be slowier:

df1 = df[df.groupby('ids')['accept'].transform('all').eq(df['accept'])]

dataframe

byMR

Published December 14, 2022

Add a comment

Operator '==' cannot be applied…(java)

byMR

December 14, 2022

Questions

Count the absolute number and the percentage of NAs for every column in a dataframe

byMR

December 14, 2022

Questions

Changing window.location.reload Time

byMR

December 14, 2022

Questions

Property 'xxx' does not exist on type 'string' when index signature has the property type

byMR

December 14, 2022

Questions

Export image file as module React TypeScript

byMR

December 14, 2022

Questions

Create Duration Variable In R

byMR

December 14, 2022

How can I filter a Pandas DataFrame based on whether all aggregated values in a column are True?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Operator '==' cannot be applied…(java)

Count the absolute number and the percentage of NAs for every column in a dataframe

Changing window.location.reload Time

Property 'xxx' does not exist on type 'string' when index signature has the property type

Export image file as module React TypeScript

Create Duration Variable In R

Keep Up to Date with the Most Important News

How can I filter a Pandas DataFrame based on whether all aggregated values in a column are True?

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Operator '==' cannot be applied…(java)

Count the absolute number and the percentage of NAs for every column in a dataframe

Changing window.location.reload Time

Property 'xxx' does not exist on type 'string' when index signature has the property type

Export image file as module React TypeScript

Create Duration Variable In R

Discover more from Dev solutions