I have a dataframe grouped that contains rows assigned by a reference number like so:
ref origin delivery
1 x lux
1 lux y
1 x lux
2 z c
2 q w
3 p lux
3 lux t
4 k lux
4 k lux
I have been trying to find a way to extract all the rows that have the same ref and have lux in both columns at least once:
ref origin delivery
1 x lux
1 lux y
1 x lux
3 p lux
3 lux t
Tried doing this:
is_crossdock <- c()
for(i in 1: nrow(grouped)){
cmrf <- grouped$ref[i]
for(j in 1:nrow(grouped[which(grouped$ref== cmrf),])){
if(any(grouped$origin=='lux' && any(grouped$delivery=='lux'))){
is_crossdock = c(is_crossdock,i)
}
}
}
but result is incorrect
>Solution :
With dplyr
library(dplyr)
df %>%
group_by(ref) %>%
filter("lux" %in% origin & "lux" %in% delivery)
# # A tibble: 5 × 3
# # Groups: ref [2]
# ref origin delivery
# <int> <chr> <chr>
# 1 1 x lux
# 2 1 lux y
# 3 1 x lux
# 4 3 p lux
# 5 3 lux t
Or in base R:
lux_refs = intersect(
df[df$origin == "lux", "ref"],
df[df$delivery == "lux", "ref"]
)
df[df$ref %in% lux_refs, ]
# ref origin delivery
# 1 1 x lux
# 2 1 lux y
# 3 1 x lux
# 6 3 p lux
# 7 3 lux t
Using this sample data:
df = read.table(text = 'ref origin delivery
1 x lux
1 lux y
1 x lux
2 z c
2 q w
3 p lux
3 lux t
4 k lux
4 k lux', header = T)