Suppose I have a dataframe with timeseries data, and associated values:
| Date and Time | Value 1 | Gap in Time | Group |
|---|---|---|---|
| 2023-02-01 12:00:00 | X | 5 | 1 |
| 2023-02-01 12:05:00 | X | 5 | 1 |
| 2023-02-01 12:10:00 | X | 5 | 1 |
| 2023-02-01 12:15:00 | X | 5 | 1 |
| 2023-02-01 13:00:00 | X | 45 | 2 |
| 2023-02-01 13:05:00 | X | 5 | 1 |
| 2023-02-01 13:10:00 | X | 5 | 1 |
| 2023-02-01 16:00:00 | X | 175 | 2 |
| 2023-02-01 16:05:00 | X | 5 | 1 |
| 2023-02-01 16:10:00 | X | 5 | 1 |
I want to assign group nuymbers to the data where the gaps in time are less than 25 minutes. Using ifelse, I can assign groups to rows where the gap in time is less than 25 or greater than 25:
ifelse(df$Gap < 25, 1, 2)
But I would like to assign all subsequent observations the same group number until another gap in time greater than 25 is observed, as is follows:
| Date and Time | Value 1 | Gap in Time | Group |
|---|---|---|---|
| 2023-02-01 12:00:00 | X | 5 | 1 |
| 2023-02-01 12:05:00 | X | 5 | 1 |
| 2023-02-01 12:10:00 | X | 5 | 1 |
| 2023-02-01 12:15:00 | X | 5 | 1 |
| 2023-02-01 13:00:00 | X | 45 | 2 |
| 2023-02-01 13:05:00 | X | 5 | 2 |
| 2023-02-01 13:10:00 | X | 5 | 2 |
| 2023-02-01 16:00:00 | X | 175 | 3 |
| 2023-02-01 16:05:00 | X | 5 | 3 |
| 2023-02-01 16:10:00 | X | 5 | 3 |
I tried doing nested ifelse statements, but to no avail:
Group <- 1
ifelse(df$Gap < 25, Group,
ifelse(df$Gap >= 25, Group+1,
ifelse(df$Gap[ >= 25,(Group <- Group+1), Group)
Updated to fix a typo.
>Solution :
df$Group <- cumsum(df[['Gap in Time']] > 25) + 1
df
Date and Time Value 1 Gap in Time Group
1 2023-02-01 12:00:00 X 5 1
2 2023-02-01 12:05:00 X 5 1
3 2023-02-01 12:10:00 X 5 1
4 2023-02-01 12:15:00 X 5 1
5 2023-02-01 13:00:00 X 45 2
6 2023-02-01 13:05:00 X 5 2
7 2023-02-01 13:10:00 X 5 2
8 2023-02-01 16:00:00 X 175 3
9 2023-02-01 16:05:00 X 5 3
10 2023-02-01 16:10:00 X 5 3