Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

dplyr solution to exact and partial-string join

I need to join two datasets by two columns, one column by an exact match and the other one by a partial match:

pf=data.frame('exact'=c('s1','s2','s3','s2','s4'),'id_part'=c('a','a','a','b','c'), 'value'=c(1,2,3,4,5))
> pf
  exact id_part value
1    s1       a     1
2    s2       a     2
3    s3       a     3
4    s2       b     4
5    s4       c     5

and

cj=data.frame('exact'=c('s1','s1','s4','s2','s4'), 'id_part'=c('saf','r2a@ff','k5-a','6b4-d','ab1'))
> cj
  exact id_part
1    s1     saf
2    s1  r2a@ff
3    s4    k5-a
4    s2   6b4-d
5    s4     ab1

the desired outcome should be a datasets with the same rows as cj plus an additional columns with values form pf, in a way that if pf$exact==cj$exact & pf$id_part %in% cj$id_part,then output$value<- pf$value, else output$value<-0:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

> output
  exact id_part value
1    s1     saf     1
2    s1  r2a@ff     1
3    s4    k5-a     0
4    s2   6b4-d     4
5    s4     ab1     0

i.e. an exact match on the exact column and a partial match on the id_part column. I tried with stringdist_inner_join(cj,pf, by=c('exact','id_part'), method='lv') and similar, but got nowhere.

any help appreciated.

>Solution :

You can use regex_left_join:

library(fuzzyjoin)
library(dplyr)
library(tidyr)

regex_left_join(cj, pf) |>
  mutate(value = replace_na(value, 0))

#   exact.x id_part.x exact.y id_part.y value
# 1      s1       saf      s1         a     1
# 2      s1    r2a@ff      s1         a     1
# 3      s4      k5-a    <NA>      <NA>     0
# 4      s2     6b4-d      s2         b     4
# 5      s4       ab1    <NA>      <NA>     0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading