I have the dataframe below. event_timestamp is a column of type dtype: datetime64[ns].
event_timestamp value
2024-02-02 09:29:19.623481531 8
2024-02-02 09:29:19.907333355 9
2024-02-02 09:29:19.907373437 10
2024-02-02 09:29:21.366842178 11
2024-02-02 09:29:21.366886264 12
2024-02-02 09:29:21.512928275 13
2024-02-02 09:29:21.512968294 14
2024-02-02 09:29:23.050536162 15
2024-02-02 09:29:23.300983260 16
2024-02-02 09:29:23.318874509 17
2024-02-02 09:29:23.318916726 18
What I am trying to achieve: For each row, calculate the average of ‘value’ in the next 1s.
Expected output:
- 1st row: avg(8,9,10) = 9
- 2nd row: avg(9,10) = 9.5
- 3rd row: avg(10) = 10
- 4th row: avg(11,12,13,14) = 12.5
- 5th row: avg(2,13,14) = 13
I tried with the below but no luck, the output was in the ‘avg_value_next_1s’ was wrong, though I can’t tell what it’s doing:
data = {
'event_timestamp': ['2024-02-02 09:29:19.623481531', '2024-02-02 09:29:19.907333355', '2024-02-02 09:29:19.907373437', '2024-02-02 09:29:21.366842178', '2024-02-02 09:29:21.366886264', '2024-02-02 09:29:21.512928275', '2024-02-02 09:29:21.512968294', '2024-02-02 09:29:23.050536162', '2024-02-02 09:29:23.300983260'],
'value': [8, 9, 10, 11, 12, 13, 14, 15, 16]
}
# Create DataFrame
df = pd.DataFrame(data)
# Convert 'event_timestamp' column to timestamp object
df['event_timestamp'] = pd.to_datetime(df['event_timestamp'])
df = df.set_index("event_timestamp")
df['avg_value_next_1s'] = df['value'].rolling('1s').mean()
>Solution :
Since you want the next 1s, you should invert the input before rolling:
df['avg_value_next_1s'] = df.loc[::-1, 'new_column'].rolling('1s').mean()
Output:
new_column avg_value_next_1s
event_timestamp
2024-02-02 09:29:19.623481531 8 9.0
2024-02-02 09:29:19.907333355 9 9.5
2024-02-02 09:29:19.907373437 10 10.0
2024-02-02 09:29:21.366842178 11 12.5
2024-02-02 09:29:21.366886264 12 13.0
2024-02-02 09:29:21.512928275 13 13.5
2024-02-02 09:29:21.512968294 14 14.0
2024-02-02 09:29:23.050536162 15 15.5
2024-02-02 09:29:23.300983260 16 16.0