How to replace one underscore and delete another in a single string

Suppose I have this data frame:

data = data.frame('A' = c('A_A_', 'B_B_'))

Each value looks like A_A_. I would like to remove the final underscore and replace the central one with ->. How can I do this in a single step instead of the following two?

library(dplyr)

data %>%
  mutate(A = sub("_$", "", A)) %>%   # drop the trailing underscore
  mutate(A = sub("_", "->", A))      # replace the remaining underscore with "->"

Thanks

>Solution :

You could use sub() with capture groups:

data$A <- sub("([^_]+)_([^_]+)_", "\\1->\\2", data$A)

The regex pattern used here says to match:

  • ([^_]+) match and capture in \1 the first term
  • _ match the first underscore
  • ([^_]+) match and capture in \2 the second term
  • _ match the final underscore

Then we splice together the two segments separated by ->.
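
For reference, here is a minimal base R sketch of the same idea run against the sample data from the question. The variable name pattern and the ^ and $ anchors are additions for illustration and are not part of the call above; with the anchors, a string that does not have exactly the term_term_ shape should be left unchanged.

```r
# Sample data from the question
data <- data.frame(A = c("A_A_", "B_B_"))

# Anchored variant of the pattern (the ^ and $ are an assumption added here):
#   ^([^_]+)  first term, captured as \1
#   _         the central underscore (replaced by "->")
#   ([^_]+)   second term, captured as \2
#   _$        the trailing underscore (dropped)
pattern <- "^([^_]+)_([^_]+)_$"

# A single sub() call does both the replacement and the deletion
data$A <- sub(pattern, "\\1->\\2", data$A)

print(data$A)
# [1] "A->A" "B->B"

# A value without the trailing underscore does not match and is returned as-is
unchanged <- sub(pattern, "\\1->\\2", "A_A")
```

Inside a dplyr pipeline, the same call should fit in a single mutate(), for example mutate(A = sub(pattern, "\\1->\\2", A)).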
