New column based on if col1 is a substring of col2

I’m trying to make a new column based on whether one column is a substring of another. Using if_else() with grepl() works when the pattern is a constant, but not when comparing two columns to each other.

library(dplyr)

df <- data.frame(col1 = c("first street", "second st", "third st apt1"),
                 col2 = c("first street #6", "second st", "third st"))

df <- df %>% dplyr::mutate(test = if_else(grepl("st", col2, fixed = TRUE), 1, 0)) # WORKS
df <- df %>% dplyr::mutate(test2 = if_else(grepl(col1, col2, fixed = TRUE), 1, 0)) # WARNING, wrong result

Warning message:
Problem with `mutate()` column `test`.
i `test = if_else(grepl(col1, col2, fixed = TRUE), 1, 0)`.
i argument 'pattern' has length > 1 and only the first element will be used 

>df
    col1            col2               test test2
1   first street    first street #6    1     1
2   second st       second st          1     0    <--- should be 1
3   third st apt1   third st           1     0

Why can’t I pass both columns as variables to grepl()? Other functions work fine inside mutate(); for instance, test3 = paste(col1, col2) returns the expected result.

Solution:

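grepl() is not vectorised over its pattern argument: it uses only the first element of col1 and compares it against every row of col2, which is what the warning is telling you. You could use rowwise() before the mutate() so that grepl() sees one row at a time, or you could use str_detect() from stringr, which is vectorised over both the string and the pattern: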

library(tidyverse)
df <- data.frame(col1 = c("first street", "second st", "third st apt1"),
                 col2 = c("first street #6", "2nd st", "third st"))

df <- df %>% rowwise() %>% dplyr::mutate(test2 = if_else(grepl(col1, col2,fixed=TRUE),1,0)) 
df
#> # A tibble: 3 × 3
#> # Rowwise: 
#>   col1          col2            test2
#>   <chr>         <chr>           <dbl>
#> 1 first street  first street #6     1
#> 2 second st     2nd st              0
#> 3 third st apt1 third st            0
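
Note that the result of rowwise() stays a row-wise tibble (the "# Rowwise:" line in the printout), so later dplyr verbs would also run one row at a time. A minimal sketch, assuming you want an ordinary data frame afterwards, that uses the question's original data (where row 2 of col2 is "second st") and drops the row-wise grouping with ungroup():

```r
library(dplyr)

# Data from the question: row 2 of col2 is "second st", so test2 should be 1 there
df <- data.frame(col1 = c("first street", "second st", "third st apt1"),
                 col2 = c("first street #6", "second st", "third st"))

df <- df %>%
  rowwise() %>%   # grepl() now receives one col1 value and one col2 value per call
  mutate(test2 = if_else(grepl(col1, col2, fixed = TRUE), 1, 0)) %>%
  ungroup()       # drop the row-wise grouping so later verbs work on whole columns

df$test2
```

With this data, rows 1 and 2 match and row 3 does not, because "third st apt1" is longer than "third st".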


df <- data.frame(col1 = c("first street", "second st", "third st apt1"),
                 col2 = c("first street #6", "2nd st", "third st"))

df <- df %>% dplyr::mutate(test2 = if_else(str_detect(col2, col1),1,0)) 
df
#>            col1            col2 test2
#> 1  first street first street #6     1
#> 2     second st          2nd st     0
#> 3 third st apt1        third st     0

Created on 2022-02-01 by the reprex package (v2.0.1)
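
One difference from the question's code is worth noting: the question used fixed = TRUE, while str_detect() treats its pattern as a regular expression by default, so a col1 value containing characters such as "(" or "." could match differently or fail. A sketch of two vectorised alternatives that keep literal matching, again on the question's original data; the column names test_stringr and test_base are made up for illustration:

```r
library(dplyr)
library(stringr)

# Data from the question
df <- data.frame(col1 = c("first street", "second st", "third st apt1"),
                 col2 = c("first street #6", "second st", "third st"))

df <- df %>%
  mutate(
    # stringr: vectorised over string and pattern; fixed() requests literal matching
    test_stringr = as.integer(str_detect(col2, fixed(col1))),
    # base R: mapply() calls grepl(pattern, x) once per pair of values
    test_base = as.integer(mapply(grepl, col1, col2,
                                  MoreArgs = list(fixed = TRUE),
                                  USE.NAMES = FALSE))
  )

df
```

Both columns should agree row by row. The stringr version avoids rowwise() entirely, which tends to matter on larger data frames, since row-wise evaluation calls the function once per row.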
