How to find the last row in a dataframe that contains a specific value in a specific column?

I am looking for a Python function that retrieves the value in the ‘date’ column from the last row for each person in my dataframe. I need this to find the last date on which each person entered data.

I have tried splitting the dataframe by person, using the tail() function to get all columns of the last row, and then grabbing the date. However, this does not work well for a large dataframe containing many people.

   name   score    date
1  Mary   2        22-Feb-2022
2  Mary   1        16-Mar-2022
5  John   2        18-Dec-2022
6  Mary   3        01-Jan-2023 

Solution:

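A possible solution, assuming the ‘date’ column is first converted with pd.to_datetime (the output below shows datetime64 values). Note that last() returns the last non-null value of each group in row order, so it gives the latest date only if each person's rows are already in chronological order:

df['date'] = pd.to_datetime(df['date'], format='%d-%b-%Y')
df.groupby('name')['date'].last()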

Output:

name
John   2022-12-18
Mary   2023-01-01
Name: date, dtype: datetime64[ns]
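
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample dataframe from the question (the index values 1, 2, 5, 6 and the `%d-%b-%Y` date format are read off the sample and are assumptions about the real data). It also computes `max()` next to `last()`, which may be the safer choice if the rows are not guaranteed to be in chronological order.

```python
import pandas as pd

# Sample data copied from the question; the index mirrors the row labels shown there.
df = pd.DataFrame(
    {
        "name": ["Mary", "Mary", "John", "Mary"],
        "score": [2, 1, 2, 3],
        "date": ["22-Feb-2022", "16-Mar-2022", "18-Dec-2022", "01-Jan-2023"],
    },
    index=[1, 2, 5, 6],
)

# Assumption: dates are strings like "22-Feb-2022"; convert them to real datetimes.
df["date"] = pd.to_datetime(df["date"], format="%d-%b-%Y")

# Last date per person by row position (what the answer uses).
last_by_position = df.groupby("name")["date"].last()

# Latest date per person by value, independent of row order.
latest_by_value = df.groupby("name")["date"].max()

print(last_by_position)
```

On this sample both approaches agree because each person's rows are already sorted by date.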

If you want to add the last date to the dataframe:

df['last_date'] = df.groupby('name')['date'].transform('last')

Output:

   name  score       date  last_date
1  Mary      2 2022-02-22 2023-01-01
2  Mary      1 2022-03-16 2023-01-01
5  John      2 2022-12-18 2022-12-18
6  Mary      3 2023-01-01 2023-01-01
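
If the whole last row per person is wanted rather than just the date, one possible sketch is to use `groupby(...).tail(1)`, or a boolean filter plus `iloc[-1]` for a single person. The sample frame is rebuilt from the question; the variable names `last_rows` and `mary_last` are only illustrative. Both selections go by row position, so they work even if the dates are still strings.

```python
import pandas as pd

# Sample data copied from the question (dates left as strings on purpose).
df = pd.DataFrame(
    {
        "name": ["Mary", "Mary", "John", "Mary"],
        "score": [2, 1, 2, 3],
        "date": ["22-Feb-2022", "16-Mar-2022", "18-Dec-2022", "01-Jan-2023"],
    },
    index=[1, 2, 5, 6],
)

# Last row of every person, all columns, without splitting the frame manually.
last_rows = df.groupby("name").tail(1)

# Last row for one specific person: filter on the column value, then take the final row.
mary_last = df.loc[df["name"] == "Mary"].iloc[-1]

print(last_rows)
print(mary_last["date"])
```

`tail(1)` keeps the original index, so the result can be matched back to the source rows.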