Home Filter rows from Pandas dataframe based on max value of a column

Questions

Filter rows from Pandas dataframe based on max value of a column

January 18, 2023

I have a pandas dataframe with following columns

col1   col2   col3
x      12     abc
x       7     abc
x       5     abc
x       3   
y      10     abc
y       9     abc

I would like to find all rows in a pandas DataFrame which have the max value for col2 column, after grouping by ‘col1’ columns after filtering the rows where col3 is null?

The expected output is:

col1   col2   col3
x      12     abc
y      10     abc

I have tried the below code so far.

df[df[['col3']].notnull().all(1) & df.sort_values('col2').drop_duplicates(['col1'], keep='last')]

However I am getting following error.

TypeError: unsupported operand type(s) for &: 'bool' and 'float'

Any help is highly appreciated

>Solution :

How max method calculate without mentioning the column?

According to the pd.DataFrame.max it returns the maximum values over the selected axis with default being (0, index).

In your example you only have 1 numeric values and all values in col3 are identical. If col3 was also numeric, max method would return the max values of that column as well with the resulting DataFrame may have different rows than the original one.

It is suitable in this case but if you only would like the output DataFrame have the rows as the original one you need to be specific about the column whose maximum you would like to consider.

df.loc[df.notnull().all(axis=1)].groupby('col1').max().reset_index()

  col1  col2 col3
0    x    12  abc
1    y    10  abc

Or you can create a boolean Series first and assign it to a name to improve readability:

m = df.notnull().all(axis=1)
df.loc[m].groupby('col1').max().reset_index()

Let’s say now this is your original DataFrame:

  col1  col2  col3
0    x    12   2.0
1    x     7  20.0
2    x     5   1.0
3    x     3   NaN
4    y    10   4.0
5    y     9  11.0

When you apply max on this without specifying the column name it will return the following:

  col1  col2  col3
0    x    12  20.0
1    y    10  11.0

pandas

byMR

Published January 18, 2023

Add a comment

Turning single values and arrays into arrays including zeros as single value

byMR

January 18, 2023

Questions

Turning single values and arrays into arrays including zeros as single value

byMR

January 18, 2023

Questions

SwidtUI iOS – Cannot use instance member x within property initializer; property initializers run before 'self' is available

byMR

January 18, 2023

Questions

How to correctly return nested types in haskell?

byMR

January 18, 2023

Questions

Sed wont replace variable created inside for loop

byMR

January 18, 2023

Questions

SyntaxError: Cannot use import statement outside a module in javascript

byMR

January 18, 2023

Filter rows from Pandas dataframe based on max value of a column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Turning single values and arrays into arrays including zeros as single value

Turning single values and arrays into arrays including zeros as single value

SwidtUI iOS – Cannot use instance member x within property initializer; property initializers run before 'self' is available

How to correctly return nested types in haskell?

Sed wont replace variable created inside for loop

SyntaxError: Cannot use import statement outside a module in javascript

Keep Up to Date with the Most Important News

Filter rows from Pandas dataframe based on max value of a column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Turning single values and arrays into arrays including zeros as single value

Turning single values and arrays into arrays including zeros as single value

SwidtUI iOS – Cannot use instance member x within property initializer; property initializers run before 'self' is available

How to correctly return nested types in haskell?

Sed wont replace variable created inside for loop

SyntaxError: Cannot use import statement outside a module in javascript

Discover more from Dev solutions