I have a data frame A, say :
| id | var |
|---|---|
| A | phone |
| B | car |
| C | bike |
| D | chair |
| E | mouse |
| Z | NA |
now I have another data frame B, that contain all the registered products by the id:
| id | var2 |
|---|---|
| A | phone |
| A | car |
| B | NA |
| C | bike |
| C | phone |
| D | chair |
| D | phone |
| D | car |
| E | chair |
| E | phone |
| E | car |
| Z | car |
| H | bike |
I want to check if the values declared in the data frame A exist in the registration data frame B by id.Actually to mutate a column that will check the condition by id and will return TRUE or FALSE accordingly.
Ideally the resulted output must de a data frame in R that will look like this :
| id | var | CHECK |
|---|---|---|
| A | phone | TRUE |
| B | car | FALSE |
| C | bike | TRUE |
| D | chair | TRUE |
| E | mouse | FALSE |
| Z | NA | FALSE |
the data in R are the following:
library(dplyr)
id = c("A","B","C","D","E","Z")
var = c("phone","car","bike","chair","mouse",NA);var
A = tibble(id,var);A
id = c(rep("A",2),"B",rep("C",2),rep("D",3),rep("E",3),"Z","H")
var2 = c("phone","car",NA,"bike","phone","chair","phone","car","chair","phone","car","car","bike")
B = tibble(id,var2);B
Any help?
>Solution :
You can join the two dataframes and check if all the values exist in another.
library(dplyr)
A %>%
left_join(B, by = 'id') %>%
group_by(id, var) %>%
summarise(CHECK = all(var %in% var2), .groups = "drop")
# id var CHECK
# <chr> <chr> <lgl>
#1 A phone TRUE
#2 B car FALSE
#3 C bike TRUE
#4 D chair TRUE
#5 E mouse FALSE
#6 Z NA FALSE