How to filter rows and columns based on the maximum value in a Python DataFrame

I have a DataFrame with the columns Year, country, and Value.

This is the code I used, but it does not produce the expected output.

df = df.sort_values(by=['country','Year','Value'], ascending=[True,True,False])
df = df.drop_duplicates('country')

How can I get the expected output, i.e. for each country, the row with the maximum Value?

>Solution :

Try sorting by "Value" and keeping the last row for each country. Because the sort is ascending, the last row for each country is the one with the largest value:

>>> df.sort_values("Value").drop_duplicates("country",keep="last")
    Year country  Value
2   2003     USA   7000
6   2002   India   9000
10  2001   Japan  10000
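
To try this without the original data, here is a minimal self-contained sketch. The sample frame is made up for illustration (the real data was only posted as an image), so the index labels differ from the output above, and `max_row_per_country` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical sample data: values are invented so that each country's
# maximum matches the Year/Value pairs shown in the answer.
df = pd.DataFrame({
    "Year":    [2001, 2002, 2003, 2001, 2002, 2003, 2001, 2002, 2003],
    "country": ["USA"] * 3 + ["India"] * 3 + ["Japan"] * 3,
    "Value":   [5000, 6000, 7000, 8000, 9000, 8500, 10000, 9500, 9900],
})

def max_row_per_country(frame):
    # An ascending sort puts each country's largest Value last,
    # so keep="last" retains exactly that row.
    return frame.sort_values("Value").drop_duplicates("country", keep="last")

result = max_row_per_country(df)
print(result)
```

Note that the original index labels are preserved, so the result can still be traced back to rows in `df`.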

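Alternatively, you could use groupby with transform('max') to build a boolean mask. Unlike drop_duplicates, this keeps every row that ties for a country's maximum: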

>>> df[df["Value"].eq(df.groupby("country")["Value"].transform('max'))]
    Year country  Value
2   2003     USA   7000
6   2002   India   9000
10  2001   Japan  10000
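
The two approaches only differ when a country has several rows sharing its maximum. The sketch below uses made-up data with a deliberate tie to show that the mask keeps all tied rows, while `idxmax` (a third option that is not part of the original answer) keeps only the first one per country.

```python
import pandas as pd

# Hypothetical data: USA has two rows tied at its maximum Value.
df = pd.DataFrame({
    "Year":    [2001, 2002, 2003, 2001, 2002],
    "country": ["USA", "USA", "USA", "India", "India"],
    "Value":   [7000, 7000, 5000, 9000, 8000],
})

# For every row, the maximum Value of that row's country (aligned to df's index).
group_max = df.groupby("country")["Value"].transform("max")

# Boolean mask: keeps every row equal to its group's maximum, ties included.
all_max_rows = df[df["Value"].eq(group_max)]

# idxmax gives one index label per group (the first occurrence of the maximum).
one_row_each = df.loc[df.groupby("country")["Value"].idxmax()]

print(all_max_rows)
print(one_row_each)
```

Which one to use depends on whether tied rows should all be reported or collapsed to a single row per country.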