How to get timestamp differences per group?

I have a "date" column and a "group" column in my dataset. I want to get the difference between the minimum and maximum dates in the "date" column for each group. How can I do that? Here is an example of my data:

group      date
main      2024-01-01
main      2024-01-03
main      2024-01-05
second    2024-02-05
second    2024-02-20

Desired result (difference in days):

group      date_diff
main          4
second        15

I tried using groupby, but I don't know how to get the differences from there.

Solution:

Get the minimum and maximum datetimes per group with GroupBy.agg, subtract them with Series.sub, and convert the resulting timedeltas to whole days with Series.dt.days:

import pandas as pd

# Parse the column as datetimes so min, max, and subtraction operate on dates
df['date'] = pd.to_datetime(df['date'])

out = (df.groupby('group')['date'].agg(['min','max'])
         .assign(date_diff=lambda x: x['max'].sub(x['min']).dt.days)['date_diff']
         .reset_index())

print(out)
    group  date_diff
0    main          4
1  second         15
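For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the data looks exactly like the sample in the question; the intermediate name `bounds` is only illustrative and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "group": ["main", "main", "main", "second", "second"],
    "date": ["2024-01-01", "2024-01-03", "2024-01-05",
             "2024-02-05", "2024-02-20"],
})

# Strings must become datetimes before they can be subtracted.
df["date"] = pd.to_datetime(df["date"])

# One row per group with the earliest and latest date.
bounds = df.groupby("group")["date"].agg(["min", "max"])

# Timedelta between latest and earliest date, expressed in whole days.
out = (bounds["max"] - bounds["min"]).dt.days.reset_index(name="date_diff")
print(out)
```

Splitting the chain into two statements does the same work as the one-liner above; it just makes the intermediate min/max table easy to inspect.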

Alternatively, use a lambda function; on a large DataFrame this is likely to be slower:

out = (df.groupby('group')['date'].agg(lambda x: (x.max() - x.min()).days)
         .reset_index(name='date_diff'))
print(out)
    group  date_diff
0    main          4
1  second         15
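If you want to check the speed difference on your own machine, a rough sketch like the following may help. The synthetic data (100,000 rows, up to 1,000 integer group labels) and the helper names `with_builtin_aggs` and `with_lambda` are assumptions made for illustration; timings depend on your pandas version and data shape, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: random group labels and random dates within 2024.
rng = np.random.default_rng(0)
n = 100_000
big = pd.DataFrame({
    "group": rng.integers(0, 1_000, size=n),
    "date": pd.Timestamp("2024-01-01")
            + pd.to_timedelta(rng.integers(0, 365, size=n), unit="D"),
})

def with_builtin_aggs(df):
    # Built-in 'min'/'max' aggregations, then one vectorized subtraction.
    b = df.groupby("group")["date"].agg(["min", "max"])
    return (b["max"] - b["min"]).dt.days.reset_index(name="date_diff")

def with_lambda(df):
    # The Python lambda is called once per group.
    return (df.groupby("group")["date"]
              .agg(lambda x: (x.max() - x.min()).days)
              .reset_index(name="date_diff"))

# Both approaches should return the same values.
a, b = with_builtin_aggs(big), with_lambda(big)
assert a["date_diff"].tolist() == b["date_diff"].tolist()

# Time a few repetitions of each approach.
print("built-in aggs:", timeit.timeit(lambda: with_builtin_aggs(big), number=3))
print("lambda:       ", timeit.timeit(lambda: with_lambda(big), number=3))
```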