I have a dataframe that looks like this:
> dput(df)
structure(list(Person_ID = c(123L, 123L), Disease_Name = c("Heart Disease",
"Lung Disease"), Disease_start = c("4/11/17", "4/11/17"), Procedure_start = c("4/11/18",
"4/11/16")), class = "data.frame", row.names = c(NA, -2L))
I want to restructure the dataframe so that:
- If the Disease_start is BEFORE Procedure_start, then convert Disease_Name to a blank/NA cell
- If the Disease_start is AFTER Procedure_start, then leave Disease_Name (don't change anything)
The output dataset should look like this:
> dput(df2)
structure(list(Person_ID = c(123L, 123L), Disease_Name = c("",
"Lung Disease"), Disease_start = c("4/11/17", "4/11/17"), Procedure_start = c("4/11/18",
"4/11/16")), class = "data.frame", row.names = c(NA, -2L))
Thank you!
>Solution :
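Use ifelse or case_when. The date columns are stored as character strings, so convert them with as.Date before comparing; otherwise < compares the text alphabetically rather than chronologically. The format below assumes the dates are month/day/year.
library(dplyr)
df %>%
  mutate(Disease_Name = case_when(
    as.Date(Disease_start, format = "%m/%d/%y") <
      as.Date(Procedure_start, format = "%m/%d/%y") ~ "",
    TRUE ~ Disease_Name))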
-output
  Person_ID Disease_Name Disease_start Procedure_start
1       123                    4/11/17         4/11/18
2       123 Lung Disease       4/11/17         4/11/16
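For the ifelse route mentioned above, a minimal base R sketch might look like the following. It assumes the dates are month/day/year with two-digit years (the question does not say which order is used), and the helper names disease_date and procedure_date are only illustrative.

```r
# Sample data from the question
df <- data.frame(
  Person_ID = c(123L, 123L),
  Disease_Name = c("Heart Disease", "Lung Disease"),
  Disease_start = c("4/11/17", "4/11/17"),
  Procedure_start = c("4/11/18", "4/11/16"),
  stringsAsFactors = FALSE
)

# Assumption: dates are month/day/year with a two-digit year
disease_date <- as.Date(df$Disease_start, format = "%m/%d/%y")
procedure_date <- as.Date(df$Procedure_start, format = "%m/%d/%y")

# Blank the disease name where the disease started before the procedure;
# rows where it started on or after the procedure date are left unchanged
df2 <- df
df2$Disease_Name <- ifelse(disease_date < procedure_date, "", df$Disease_Name)
```

To get NA instead of an empty string, replace "" with NA_character_. Note that if either date fails to parse, the comparison is NA and ifelse returns NA for that row.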