Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I cross check two data frames in R using dplyr with multiple conditions?

I have a dataset that looks like this :

id1 sp
A SWO
B BFT
C HKE
D MUR
E OCC
F ALB
M ANN
N BFT

and a second one that contains the licenses of each id (but it could contain different ids)

id sp_lic
A SWO
B BFT
Z BFT
F SWO
S ALB
T SWO

The first data frame is the declared dataframe and the second one is the data frame that contains all the licenses paid.Now there are separate licenses for SWO for BFT and for ALB.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I want to check if the ids in the first dataframe are licensed to have what they declare.
For example the id A is licensed to have SWO same as B but F is not licensed to have ALB because it’s licensed to have SWO.
The problem is that the N id has declared that has BFT but it is NA in the left_join() function in R and reports me "not needed" but I want to report "danger" because he/she has no license for that.

So there are three conditions here and ideally the final data frame must look like this:

id1 sp Check
A SWO ok
B BFT ok
C HKE not_needed
D MUR not_needed
E OCC not_needed
F ALB danger
M ANN not_needed
N BFT danger

How can I make this cross check and print these messages according to the logical condition in R using dplyr?

license = c("BFT","ALB","SWO")
library(tidyverse)
id =c("A","B","C","D","E","F","M","N")
sp = c("SWO","BFT","HKE","MUR","OCC","ALB","ANN","BFT")
data1 = tibble(id,sp);data1
id = c("A","B","Z","F","S","T")
sp_l = c("SWO","BFT","BFT","SWO","ALB","SWO")
data2 = tibble(id,sp_l);data2



left_join(data1, data2, by = "id")%>% 
 mutate(sp_l = case_when(is.na(sp_l) ~ "not_needed", 
                         sp == sp_l ~ "ok_checked", 
                         sp != sp_l ~ "danger", 
                         TRUE ~ NA_character_)) %>% 
 rename("check" = "sp_l")
# A tibble: 8 × 3
 id    sp    check     
 <chr> <chr> <chr>     
1 A     SWO   ok_checked
2 B     BFT   ok_checked
3 C     HKE   not_needed
4 D     MUR   not_needed
5 E     OCC   not_needed
6 F     ALB   danger    
7 M     ANN   not_needed
8 N     BFT   not_needed

>Solution :

We could use case_when with the provided logic:

library(dplyr)

data1 %>% 
  left_join(data2) %>% 
  mutate(sp_l = case_when(sp == sp_l ~ "ok",
                          sp %in% license | !is.na(sp_l) ~ "danger",
                          is.na(sp_l) ~ "not_needed", 
                          TRUE ~ NA_character_)
         )    
  id    sp    sp_l      
  <chr> <chr> <chr>     
1 A     SWO   ok        
2 B     BFT   ok        
3 C     HKE   not_needed
4 D     MUR   not_needed
5 E     OCC   not_needed
6 F     ALB   danger    
7 M     ANN   not_needed
8 N     BFT   danger  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading