Python pandas: period frequency strings do not work for 15-minute intervals

My DataFrame looks like this:

                   timestamp       power
0        2022-01-01 00:00:00  100.000000
1        2022-01-01 00:00:01  100.004526
2        2022-01-01 00:00:02  100.009053
3        2022-01-01 00:00:03  100.013579
4        2022-01-01 00:00:04  100.018105
...                      ...         ...
31535995 2022-12-31 23:59:55  136.750000
31535996 2022-12-31 23:59:56  136.560000
31535997 2022-12-31 23:59:57  136.440000
31535998 2022-12-31 23:59:58  136.380000
31535999 2022-12-31 23:59:59  136.530000

[31536000 rows x 2 columns]

I have a super simple script:

import pandas as pd

directory = 'data/peak_shaving/20220803_132445'
df = pd.read_csv(f'{directory}/demand_profile_simulation.csv')
df['timestamp'] = pd.to_datetime(df['timestamp'])
# Group by 15-minute periods and average the power in each group
df = df.groupby(pd.PeriodIndex(df['timestamp'], freq="15min"))['power'].mean()

The result is:

timestamp
2022-01-01 00:00    100.133526
2022-01-01 00:01    100.405105
2022-01-01 00:02    100.676684
2022-01-01 00:03    100.948263
2022-01-01 00:04    101.219842
                       ...    
2022-12-31 23:55    153.952833
2022-12-31 23:56    150.040333
2022-12-31 23:57    146.124167
2022-12-31 23:58    142.225833
2022-12-31 23:59    138.318167
Freq: 15T, Name: power, Length: 525600, dtype: float64

As you can see, the data is grouped by minute, not into 15-minute intervals.
When I try other frequencies, such as one day, it works perfectly:

2022-01-01    120.291041
2022-01-02    126.085428
2022-01-03    120.840020
2022-01-04    124.335800
2022-01-05    119.230694
                 ...    
2022-12-27    125.802254
2022-12-28    123.833951
2022-12-29    126.609810
2022-12-30    123.971885
2022-12-31    122.798069
Freq: D, Name: power, Length: 365, dtype: float64

I also tested hours and many other frequencies, and they work, but I cannot make it work for 15-minute intervals. Is there an issue in my code? Thanks.

Solution:

Your solution works correctly for me. Here is an alternative with Series.dt.to_period:

df = pd.read_csv(f'{directory}/demand_profile_simulation.csv', parse_dates=['timestamp'])
df = df.groupby(df['timestamp'].dt.to_period('15Min'))['power'].mean()

Other solutions:

df = pd.read_csv(f'{directory}/demand_profile_simulation.csv', parse_dates=['timestamp'])
df = df.groupby(pd.Grouper(key='timestamp', freq="15min"))['power'].mean()
# Alternative using resample:
# df = df.resample("15min", on='timestamp')['power'].mean()
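
To check which approach actually produces 15-minute bins on your pandas version, a minimal sketch along these lines may help. It assumes a small synthetic frame (one hour of per-second readings) in place of the real CSV, and it adds a third variant, flooring the timestamps with Series.dt.floor, which is not part of the answer above. The names by_period, by_grouper, by_resample and by_floor are only illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV (assumption): one hour of per-second readings.
timestamps = pd.Timestamp("2022-01-01") + pd.to_timedelta(np.arange(3600), unit="s")
df = pd.DataFrame({"timestamp": timestamps, "power": np.linspace(100.0, 101.0, 3600)})

# The question's approach, kept only to compare the number of groups it yields.
by_period = df.groupby(pd.PeriodIndex(df["timestamp"], freq="15min"))["power"].mean()

# Timestamp-based binning: every bin starts on a 15-minute boundary.
by_grouper = df.groupby(pd.Grouper(key="timestamp", freq="15min"))["power"].mean()
by_resample = df.resample("15min", on="timestamp")["power"].mean()

# Illustrative extra variant: floor each timestamp to its 15-minute boundary.
by_floor = df.groupby(df["timestamp"].dt.floor("15min"))["power"].mean()

print("PeriodIndex groups:", len(by_period))
print("Grouper groups:    ", len(by_grouper))
print(by_grouper)
```

If the PeriodIndex count is much larger than the Grouper count, the period-based grouping is not binning the way you want on your installation, and the Grouper, resample or floor variants are the safer choice.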