R: How to drop specific characters of string in column?

August 16, 2022

How can I drop "-" or double "--" only at the beginning of the value in the text column?

df <- data.frame (x  = c(12,14,15,178),
                  text = c("--Car","-Transport","Big-Truck","--Plane"))

    x       text
1  12      --Car
2  14 -Transport
3  15  Big-Truck
4 178    --Plane

Expected output:

    x       text
1  12        Car
2  14  Transport
3  15  Big-Truck
4 178      Plane

Solution:

You can use gsub with the regex "^\\-+". The ^ anchors the match to the beginning of the string, and \\-+ matches one or more (+) hyphens (\\-), so hyphens elsewhere in the string are left alone.

gsub("^\\-+", "", df$text)
# [1] "Car"       "Transport" "Big-Truck" "Plane"

If the strings may also start with spaces that you want to remove, use the character class [ -]+ in your regex. It matches any run of spaces or hyphens at the beginning of the string.

gsub("^[ -]+", "", df$text)
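Since df$text has no leading spaces, the call above returns the same result as before. To see the character class at work, here is a minimal sketch on a hypothetical vector x_ws (not part of the question's data) that mixes leading spaces and hyphens:

```r
# Hypothetical input (an assumption for illustration): leading spaces and hyphens
x_ws <- c("  --Car", "- Transport", "Big-Truck", " -- Plane")

# "^[ -]+" removes any run of spaces and hyphens, but only at the start
cleaned <- gsub("^[ -]+", "", x_ws)
cleaned
# [1] "Car"       "Transport" "Big-Truck" "Plane"
```

Note that [ -] covers plain spaces only; tabs or other whitespace would need to be added to the class.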

To apply this to the data frame, assign the result back to the column. With the tidyverse, you can do the same inside mutate, using either gsub or stringr's str_remove:

df$text <- gsub("^\\-+", "", df$text)
# or, with dplyr and stringr (both loaded by tidyverse), adding new columns
library(tidyverse)
df %>% 
  mutate(text1 = gsub("^\\-+", "", text),
         text2 = str_remove(text, "^\\-+"))
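Because the pattern is anchored with ^, it can match at most once per string, so sub gives the same result as gsub here. The hyphen also needs no escaping outside a character class. A minimal base-R sketch under those assumptions, rebuilding the data frame from the question and checking it against the expected output:

```r
# Data frame from the question
df <- data.frame(x = c(12, 14, 15, 178),
                 text = c("--Car", "-Transport", "Big-Truck", "--Plane"))

# Anchored pattern: at most one match per string, so sub() is sufficient
df$text <- sub("^-+", "", df$text)

# Compare with the expected output from the question
expected <- c("Car", "Transport", "Big-Truck", "Plane")
identical(df$text, expected)
# [1] TRUE
```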