I have a column (called eventCat) in my data frame that is a factor with 5 levels (Drought, Dry, Normal, Wet, Storm), e.g.
eventCat
Dry
Dry
Drought
Drought
Wet
Storm
Storm
Normal
Normal
Dry
Dry
I want to assign an ID to each group of consecutive identical events, so that the df looks like this (note the different IDs for the two separate Dry events):
eventCat eventCatID
Dry 1
Dry 1
Drought 2
Drought 2
Wet 3
Storm 4
Storm 4
Normal 5
Normal 5
Dry 6
Dry 6
Solution:
For this example, you could increase eventCatID by one every time eventCat differs from the value in the previous row (and leave it unchanged when it is the same), e.g.
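library(dplyr)

# Example data from the question
df <- structure(list(eventCat = c("Dry", "Dry", "Drought", "Drought",
                                  "Wet", "Storm", "Storm", "Normal",
                                  "Normal", "Dry", "Dry")),
                class = "data.frame", row.names = c(NA, -11L))

# Compare each row with the previous one; the default makes row 1 compare
# equal to itself, so the cumulative count of changes starts at 0 (ID 1)
df %>%
  mutate(eventCatID = 1 + cumsum(eventCat != lag(eventCat, default = first(eventCat))))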
#>    eventCat eventCatID
#> 1       Dry          1
#> 2       Dry          1
#> 3   Drought          2
#> 4   Drought          2
#> 5       Wet          3
#> 6     Storm          4
#> 7     Storm          4
#> 8    Normal          5
#> 9    Normal          5
#> 10      Dry          6
#> 11      Dry          6
Created on 2022-07-26 by the reprex package (v2.0.1)
But this relies on the rows already being in the ‘right’ order, i.e. all rows belonging to one event sit next to each other (for example, sorted by time). Does this work with your real data?
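If you would rather not depend on dplyr, the same "new ID whenever the value changes" idea can probably be sketched in base R with rle(). This is only a minimal sketch: it assumes eventCat is stored as a factor (as described in the question) and that the rows are already in the order you want. Note that rle() does not accept factors directly, hence the as.character() conversion.

```r
# Example data from the question, stored as a factor
# (assumption: the real column is a factor, as the question describes)
df <- data.frame(
  eventCat = factor(c("Dry", "Dry", "Drought", "Drought", "Wet", "Storm",
                      "Storm", "Normal", "Normal", "Dry", "Dry"))
)

# rle() errors on factors, so convert to character first;
# it returns the length of each run of identical consecutive values
runs <- rle(as.character(df$eventCat))

# Number the runs 1, 2, 3, ... and repeat each number for the length of its run
df$eventCatID <- rep(seq_along(runs$lengths), times = runs$lengths)

print(df)
```

Like the dplyr version, this numbers runs in row order, so the two separate Dry runs get different IDs.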