I have a data frame in which each individual has multiple visits. I want to pick out the patients who had disease 2 at a visit after a visit with disease 1. In this case, that would be IDs 2 and 4.
ID <- c(1,1,2,2,2,3,3,3,4,4,4,4,5,5)
Visit <- c(1,2,1,2,3,1,2,3,1,2,3,4,1,2)
Disease <- c(2,2,1,2,1,1,1,1,1,1,1,2,2,1)
df <- data.frame(ID, Visit, Disease)
Solution:
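A dplyr solution (the .by argument requires dplyr 1.1.0 or later). It keeps the rows where disease 1 is immediately followed by disease 2 at the same patient's next row, so it assumes the rows are already ordered by Visit within each ID: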
library(dplyr)
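df |>
  # within each ID, keep rows where disease 1 is directly followed by disease 2
  filter(Disease == 1 & lead(Disease) == 2, .by = ID) |>
  # return the matching IDs as a vector
  pull(ID)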
Result:
[1] 2 4
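
The lead() approach above only catches disease 2 on the row directly after a disease 1 row, and it can return the same ID more than once if the pattern repeats. If "after" should mean any later visit, one possible variant (a sketch, not the only way to do it) uses dplyr's cumany() to flag whether disease 1 has already been seen for that patient. The arrange() step and the name ids_general are additions for illustration; the data is the same as in the question.

```r
library(dplyr)

# Same example data as in the question
ID <- c(1,1,2,2,2,3,3,3,4,4,4,4,5,5)
Visit <- c(1,2,1,2,3,1,2,3,1,2,3,4,1,2)
Disease <- c(2,2,1,2,1,1,1,1,1,1,1,2,2,1)
df <- data.frame(ID, Visit, Disease)

ids_general <- df |>
  # make sure visits are in chronological order within each patient
  arrange(ID, Visit) |>
  # keep disease-2 rows where disease 1 occurred at some earlier visit;
  # cumany() is TRUE from the first disease-1 row onward within each ID
  filter(Disease == 2 & cumany(Disease == 1), .by = ID) |>
  # one row per patient, then return the IDs as a vector
  distinct(ID) |>
  pull(ID)

print(ids_general)
```

On the example data this also returns IDs 2 and 4. The two versions would differ only for a patient with some other record between the disease 1 visit and the disease 2 visit.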