Pandas groupby a column and sort by date and get only the latest row

I need to group by user_id alone and display the content_id that belongs to the latest date.
When I tried the code below, one user gets the correct value but the other user (123) gets the wrong one: the expected content_id is 20, but 21 is returned.
How can I resolve this?

Sample data:

import pandas as pd

data = {'user_id': ['123', '123', '234', '234'],
        'content_id': [20, 21, 19, 18],
        'date': ['2020-10-14', '2019-01-25', '2021-05-26', '2018-03-26']}

df = pd.DataFrame(data)
df


    user_id content_id  date
0   123     20          2020-10-14
1   123     21          2019-01-25
2   234     19          2021-05-26
3   234     18          2018-03-26


df.sort_values(['user_id','content_id','date']).groupby(['user_id'])[['content_id','date']].last()


user_id content_id  date
        
123     21         2019-01-25
234     19         2021-05-26

Solution:

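If date has higher precedence than content_id, put it before content_id in sort_values, so that last() picks the row with the latest date in each group. (The dates here are ISO-formatted strings, so they sort chronologically even without conversion.)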

out = df.sort_values(['user_id','date','content_id']).groupby(['user_id'])[['content_id','date']].last()
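
To see why the sort order matters, here is a minimal sketch that runs both orderings on the sample data from the question. The variable names `wrong` and `right` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": ["123", "123", "234", "234"],
    "content_id": [20, 21, 19, 18],
    "date": ["2020-10-14", "2019-01-25", "2021-05-26", "2018-03-26"],
})

cols = ["content_id", "date"]

# Original attempt: content_id is compared before date, so within user 123
# the row with content_id 21 sorts last, regardless of its date.
wrong = df.sort_values(["user_id", "content_id", "date"]).groupby("user_id")[cols].last()

# Corrected order: date is compared before content_id, so the last row
# of each group is the one with the latest date.
right = df.sort_values(["user_id", "date", "content_id"]).groupby("user_id")[cols].last()

print(wrong)
print(right)
```

With the corrected order, content_id only acts as a tie-breaker for rows that share the same user_id and date.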

Another possibility is to convert date to datetime, find the index of the latest date per user with groupby + idxmax, and then use loc to select those rows:

df['date'] = pd.to_datetime(df['date'])
out = df.loc[df.groupby('user_id')['date'].idxmax()]

Output (first approach):

         content_id        date
user_id                        
123              20  2020-10-14
234              19  2021-05-26
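
The groupby + idxmax approach returns a slightly different shape, which the following sketch on the same sample data illustrates. The intermediate name `latest_idx` is introduced here only for readability:

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": ["123", "123", "234", "234"],
    "content_id": [20, 21, 19, 18],
    "date": ["2020-10-14", "2019-01-25", "2021-05-26", "2018-03-26"],
})

# Convert the strings to real datetimes so idxmax compares dates.
df["date"] = pd.to_datetime(df["date"])

# For each user_id, get the index label of the row with the latest date.
latest_idx = df.groupby("user_id")["date"].idxmax()

# Select those rows; the original row index and all columns are kept.
out = df.loc[latest_idx]
print(out)
```

Unlike the sort_values + last() version, this keeps user_id as a regular column and preserves the original row labels, and it does not require sorting the frame first.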