How to groupby and resample data in pandas?

I have sales data for different customers on different dates. The dates are not continuous, and I would like to resample the data to daily frequency for each customer, filling the missing days with 0. How can I do this?

MWE

import numpy as np
import pandas as pd

df = pd.DataFrame({'id': list('aababcbc'),
                  'date': pd.date_range('2022-01-01',periods=8),
                  'value':range(8)}).sort_values('id')


df

   id        date  value
0   a  2022-01-01      0
1   a  2022-01-02      1
3   a  2022-01-04      3
2   b  2022-01-03      2
4   b  2022-01-05      4
6   b  2022-01-07      6
5   c  2022-01-06      5
7   c  2022-01-08      7

The required output is as follows:

id  date        value
a   2022-01-01  0
a   2022-01-02  1
a   2022-01-03  0  ** there is no data for a on this day
a   2022-01-04  3

b   2022-01-03  2
b   2022-01-04  0  ** there is no data for b on this day
b   2022-01-05  4
b   2022-01-06  0  ** there is no data for b on this day
b   2022-01-07  6

c   2022-01-06  5
c   2022-01-07  0  ** there is no data for c on this day
c   2022-01-08  7

My attempt

df.groupby(['id']).resample('D',on='date')['value'].sum().reset_index()


Solution

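# Make sure the date column is a datetime dtype (needed if the dates were read in as strings).
df["date"] = pd.to_datetime(df["date"])
# Resample each id's rows to daily frequency; sum() gives 0 for days with no rows.
# The result is indexed by (id, date).
df.set_index("date").groupby("id").resample("1d").sum()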
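To check the approach against the required output, here is a minimal end-to-end sketch that rebuilds the MWE and flattens the result back into `id`, `date`, `value` columns. Selecting the `value` column before resampling and calling `reset_index()` are choices made for this sketch, not part of the original answer, and the name `daily` is only illustrative.

```python
import pandas as pd

# Rebuild the MWE from the question.
df = pd.DataFrame({
    "id": list("aababcbc"),
    "date": pd.date_range("2022-01-01", periods=8),
    "value": range(8),
}).sort_values("id")

daily = (
    df.set_index("date")        # resample needs a DatetimeIndex
      .groupby("id")["value"]   # assumption: only the value column is of interest
      .resample("D")            # daily bins between each id's first and last date
      .sum()                    # empty bins sum to 0
      .reset_index()            # turn the (id, date) index back into columns
)

print(daily)
```

Each `id` is filled only between its own first and last date, which matches the required output. Note that a day filled with 0 cannot be told apart from a day with a recorded value of 0; if that distinction matters, passing `min_count=1` to `sum()` should leave the missing days as NaN instead.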