I have a dataframe of logged data values. The sampling period is much shorter than required, and I want to drop rows until the sample period reaches a threshold. For example, this df has data at roughly 10-second intervals. I want to delete rows until the difference between the current row and the prior retained row is >= 60 seconds.
DateTime Value
3/1/2023 0:00:00 0.12
3/1/2023 0:00:03 0.12
3/1/2023 0:00:13 0.12
3/1/2023 0:00:23 0.12
3/1/2023 0:00:33 0.12
3/1/2023 0:00:43 0.12
3/1/2023 0:00:53 0.12
3/1/2023 0:01:03 0.12
3/1/2023 0:01:13 0.12
3/1/2023 0:01:23 0.12
3/1/2023 0:01:33 0.12
3/1/2023 0:01:43 0.13
3/1/2023 0:01:53 0.13
3/1/2023 0:02:03 0.13
3/1/2023 0:02:13 0.12
Desired output:
DateTime Value
3/1/2023 0:00:00 0.12
3/1/2023 0:01:03 0.12
3/1/2023 0:02:03 0.13
3/1/2023 0:02:13 0.12
I was going to write this with iterrows(), but the pandas documentation says you should never modify something you are iterating over. I am very new to Python and pandas, and pandas may not be the right tool for this.
>Solution :
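Is the first column a datetime type? If so, you can use the resample function. Note that resample groups rows into fixed 60-second bins and labels each result with the start of its bin, not with the original timestamp.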
import pandas as pd

# Example series sampled every 20 seconds
index = pd.date_range('1/1/2000', periods=9, freq='20s')
series = pd.Series(range(9), index=index)

# Keep the first value in each 60-second bin
series.resample("60s").first()
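To apply this to the data in the question, here is a rough sketch that shows both the resample approach and a simple loop that keeps the original timestamps. It assumes the dates are month-first and that the column names are DateTime and Value as shown; thin_by_gap is a hypothetical helper name, not a pandas function. The loop only builds a keep/drop mask and never modifies the dataframe while iterating. It keeps the first three rows of the desired output but not the final 0:02:13 row, which is only 10 seconds after the previous kept row, so it is unclear whether that row was intended.

```python
import pandas as pd

# Sample rows copied from the question.
times = [
    "3/1/2023 0:00:00", "3/1/2023 0:00:03", "3/1/2023 0:00:13",
    "3/1/2023 0:00:23", "3/1/2023 0:00:33", "3/1/2023 0:00:43",
    "3/1/2023 0:00:53", "3/1/2023 0:01:03", "3/1/2023 0:01:13",
    "3/1/2023 0:01:23", "3/1/2023 0:01:33", "3/1/2023 0:01:43",
    "3/1/2023 0:01:53", "3/1/2023 0:02:03", "3/1/2023 0:02:13",
]
values = [0.12] * 11 + [0.13, 0.13, 0.13, 0.12]
df = pd.DataFrame({"DateTime": times, "Value": values})

# Assumption: the strings are month/day/year.
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%m/%d/%Y %H:%M:%S")

# Option 1: fixed 60-second bins. Each result row is labelled with the
# start of its bin (0:00:00, 0:01:00, ...), not the original timestamp.
binned = df.set_index("DateTime").resample("60s").first()


def thin_by_gap(frame, column="DateTime", min_gap="60s"):
    """Keep a row only if it is at least min_gap after the last kept row."""
    gap = pd.Timedelta(min_gap)
    keep = []
    last_kept = None
    for ts in frame[column]:
        if last_kept is None or ts - last_kept >= gap:
            keep.append(True)
            last_kept = ts
        else:
            keep.append(False)
    # Select rows with the boolean mask; the input frame is left unchanged.
    return frame.loc[keep]


# Option 2: keep original timestamps that are >= 60 s apart.
thinned = thin_by_gap(df)
print(binned)
print(thinned)
```

The loop is plain Python and will be slower than resample on very large frames, but it follows the rule as stated in the question (gap measured from the last kept row) and preserves the logged timestamps.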