I’ve the following table
| S/N | Unique ID | Code |
|---|---|---|
| 1 | 111 | YES |
| 2 | 111 | YES |
| 3 | 111 | NO |
| 4 | 111 | YES |
| 5 | 222 | YES |
| 6 | 222 | YES |
| 7 | 222 | YES |
| 8 | 222 | YES |
| 9 | 333 | NO |
| 10 | 333 | NO |
| 11 | 333 | YES |
| 12 | 333 | YES |
How do I derive the following table based on the following conditions:
For each unique ID, if YES repeats, keep the first YES. If NO Appears, keep the following YES. I tried using mutate and it’s giving me all sort of errors.
| S/N | Unique ID | Code |
|---|---|---|
| 1 | 111 | YES |
| 4 | 111 | YES |
| 5 | 222 | YES |
| 11 | 333 | YES |
Thanks!
>Solution :
base R
ind <- ave(dat$Code == "YES", dat$`Unique ID`,
FUN = function(z) z & c(TRUE, !z[-length(z)]))
dat[ind,]
# S/N Unique ID Code
# 1 1 111 YES
# 4 4 111 YES
# 5 5 222 YES
# 11 11 333 YES
dplyr
library(dplyr)
dat %>%
group_by(`Unique ID`) %>%
filter(Code == "YES" & lag(Code == "NO", default = TRUE)) %>%
ungroup()
# # A tibble: 4 x 3
# `S/N` `Unique ID` Code
# <int> <int> <chr>
# 1 1 111 YES
# 2 4 111 YES
# 3 5 222 YES
# 4 11 333 YES
data.table
library(data.table)
as.data.table(dat)[, .SD[Code == "YES" & shift(Code == "NO", fill = TRUE),], by = `Unique ID`]
# Unique ID S/N Code
# <int> <int> <char>
# 1: 111 1 YES
# 2: 111 4 YES
# 3: 222 5 YES
# 4: 333 11 YES
Data
dat <- structure(list("S/N" = 1:12, "Unique ID" = c(111L, 111L, 111L, 111L, 222L, 222L, 222L, 222L, 333L, 333L, 333L, 333L), Code = c("YES", "YES", "NO", "YES", "YES", "YES", "YES", "YES", "NO", "NO", "YES", "YES")), class = "data.frame", row.names = c(NA, -12L))