R Create event ID based on category group

July 26, 2022

I have a column (called eventCat) in my data frame that is a factor with 5 levels (Drought, Dry, Normal, Wet, Storm), e.g.

eventCat
Dry
Dry
Drought
Drought
Wet
Storm
Storm
Normal
Normal
Dry
Dry

I want to assign an ID to each consecutive group of events, so that the df looks like this (note the different IDs for the two separate Dry events):

eventCat          eventCatID
Dry               1
Dry               1
Drought           2
Drought           2
Wet               3
Storm             4
Storm             4
Normal            5
Normal            5
Dry               6
Dry               6

>Solution:

For this example you could increase eventCatID by one every time eventCat differs from the value in the previous row (and leave it unchanged when it is the same), e.g.

library(dplyr)

df <- structure(list(eventCat = c("Dry", "Dry", "Drought", "Drought", 
                                  "Wet", "Storm", "Storm", "Normal",
                                  "Normal", "Dry", "Dry")),
                class = "data.frame", row.names = c(NA, -11L))

df %>%
  mutate(eventCatID = 1 + cumsum(eventCat != lag(eventCat, default = first(eventCat))))
#>    eventCat eventCatID
#> 1       Dry          1
#> 2       Dry          1
#> 3   Drought          2
#> 4   Drought          2
#> 5       Wet          3
#> 6     Storm          4
#> 7     Storm          4
#> 8    Normal          5
#> 9    Normal          5
#> 10      Dry          6
#> 11      Dry          6

Created on 2022-07-26 by the reprex package (v2.0.1)
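
The same run-based rule can also be sketched in base R, without dplyr. This is an alternative and not part of the original answer. It converts eventCat with as.character() so that it works for both character and factor columns, and uses rle() to find runs of consecutive identical values.

```r
# Same example data as above
df <- data.frame(eventCat = c("Dry", "Dry", "Drought", "Drought",
                              "Wet", "Storm", "Storm", "Normal",
                              "Normal", "Dry", "Dry"))

# rle() returns the length of each run of identical consecutive values
runs <- rle(as.character(df$eventCat))

# Number the runs 1, 2, 3, ... and repeat each number for the length of its run
df$eventCatID <- rep(seq_along(runs$lengths), times = runs$lengths)

df$eventCatID
#> [1] 1 1 2 2 3 4 4 5 5 6 6
```

If the data.table package is available, data.table::rleid(df$eventCat) should give the same IDs in a single call.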

But this relies on the eventCat values being in the ‘right’ order, i.e. rows that belong to the same event must be adjacent. Does this work with your real data?
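
To illustrate the ordering caveat, here is a small sketch with made-up data (the shuffled data frame is hypothetical, not from the question) in which rows of the same category are not adjacent. Every change of category starts a new ID, so the result depends entirely on row order.

```r
library(dplyr)

# Hypothetical data: the two Dry rows are separated by a Wet row
shuffled <- data.frame(eventCat = c("Dry", "Wet", "Dry"))

shuffled <- shuffled %>%
  mutate(eventCatID = 1 + cumsum(eventCat != lag(eventCat, default = first(eventCat))))

# Each row is its own run here, so each row receives its own ID
shuffled$eventCatID
#> [1] 1 2 3
```

If the real data has a date or time column, sorting by it before computing the ID would presumably be needed for the IDs to correspond to actual events.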