Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Concatenate columns in R, but only keep separator if value exists

Does anyone know of a method in R to concatenate n columns but only keep the separator if there is a value in that row? If you run the example below:

df <- data.frame(
                  name1 = c("Jim","Bob","Sue"),
                  name2 = c("Jane","","Bane"),
                  name3 = c('Conor',"",""),
                  name4 = c("","","Bonor")
                )

df$names <- paste(df$name1,df$name2,df$name3, sep=";")

You will see that separators are included at the end and in-between values even though the cells are empty, with the output:

df =

name1 name2 name3 name4  names
Jim   Jane  Conor        Jim;Jane;Conor;
Bob                      Bob;;;
Sue   Bane        Bonor  Sue;Bane;;Bonor

Is there any method to either not include or drop separators in the case where the cells are empty? With the desired outcome:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df =

name1 name2 name3 name4  names
Jim   Jane  Conor        Jim;Jane;Conor
Bob                      Bob
Sue   Bane        Bonor  Sue;Bane;Bonor

>Solution :

library(dplyr)
library(tidyr)

df %>% 
  mutate_all(na_if,"") %>% 
  unite("names", everything(), sep = ";", remove = F, na.rm = T)

#>            names name1 name2 name3 name4
#> 1 Jim;Jane;Conor   Jim  Jane Conor  <NA>
#> 2            Bob   Bob  <NA>  <NA>  <NA>
#> 3 Sue;Bane;Bonor   Sue  Bane  <NA> Bonor

Update: applying this solution on specific columns.

I am modifying akrun’s answer from the comment below;

df %>% 
  mutate(across(c("name1", "name2", "name3", "name4"), na_if, "", 
                .names = "{.col}_changed")) %>% 
  unite(names, ends_with('_changed'), na.rm = TRUE, sep = ";")

#>   name1 name2 name3 name4          names
#> 1   Jim  Jane Conor       Jim;Jane;Conor
#> 2   Bob                              Bob
#> 3   Sue  Bane       Bonor Sue;Bane;Bonor
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading