My data frame has two string columns I want to compare. The second (V2) is a list column. My data frame looks like this:
V1 V2 V3
oranges c("oranges", "apples", "berries", "plums", "cherries") 1
apples c("oranges", "apples", "berries", "bananas", "apples") 2
grapes c("oranges", "apples", "berries", "plums", "cherries") 0
berries c("berries", "apples", "berries", "plums", "cherries") 2
For each row, I want to count how many times the string in V1 appears in that row's V2 and store the count in V3 (the V3 column above shows the expected result). I have tried the following code, but I end up with an empty data frame.
matches <- x[!x$V1 %in% x$V2]
>Solution :
V1 <- c("oranges", "apples", "grapes", "berries")
V2 <- list(c("oranges", "apples", "berries", "plums", "cherries"),
           c("oranges", "apples", "berries", "bananas", "apples"),
           c("oranges", "apples", "berries", "plums", "cherries"),
           c("berries", "apples", "berries", "plums", "cherries"))
A straightforward solution is:
V3 <- mapply(function (x, y) sum(x == y), V1, V2)
#oranges  apples  grapes berries
#      1       2       0       2
Note that == works here because each row of V1 holds a single value, which R recycles against every element of the matching V2 entry.
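To make the recycling concrete, here is a minimal sketch for a single row. The names fruit and basket are just illustrative stand-ins for one value of V1 and the matching element of V2:

```r
# One value from V1 and the matching vector from V2 (row 2 of the example data)
fruit  <- "apples"
basket <- c("oranges", "apples", "berries", "bananas", "apples")

# The length-1 value is compared against every element of the vector
hits <- fruit == basket
hits
# [1] FALSE  TRUE FALSE FALSE  TRUE

# Summing a logical vector counts the TRUE values
sum(hits)
# [1] 2
```

mapply() simply repeats this comparison for each pair of V1 and V2 elements.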
If every row of V2 has the same number of elements, I recommend:
V3 <- rowSums(V1 == do.call(rbind, V2))
#[1] 1 2 0 2
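If the data actually lives in a data frame with a list column, as in the question, the same idea can be applied to the columns directly. The following is only a sketch under assumptions: the data frame name df and the helper count_matches are hypothetical, and the lengths() check is my own addition, there because rbind() recycles shorter vectors and could then inflate the counts.

```r
# Rebuild the example data as a data frame with a list column
# (df is a hypothetical name, not from the question)
df <- data.frame(V1 = c("oranges", "apples", "grapes", "berries"),
                 stringsAsFactors = FALSE)
df$V2 <- list(c("oranges", "apples", "berries", "plums", "cherries"),
              c("oranges", "apples", "berries", "bananas", "apples"),
              c("oranges", "apples", "berries", "plums", "cherries"),
              c("berries", "apples", "berries", "plums", "cherries"))

# Hypothetical helper: use the matrix approach only when all
# list elements have the same length, otherwise fall back to mapply()
count_matches <- function(v1, v2) {
  if (length(unique(lengths(v2))) == 1L) {
    rowSums(v1 == do.call(rbind, v2))
  } else {
    mapply(function(x, y) sum(x == y), v1, v2, USE.NAMES = FALSE)
  }
}

df$V3 <- count_matches(df$V1, df$V2)
df$V3
# [1] 1 2 0 2
```

The fallback branch is slower on large data, but it gives correct counts when the V2 entries differ in length.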