I’m using R 4.3.1.
I have a data frame with several variables, including years. The years are formatted this way:
X1960..YR1960.
I would like to rename all the variables following this pattern to a simplified version:
Y1960
The dataframe contains variables from X1960..YR1960. to X2022..YR2022.
My current approach:
names(df) <- str_replace(names(df), "X\\d\\d\\d\\d\\.\\.YR*.", "Y")
Result of the current approach:
Y960.
I don’t understand the following things: Why is the first digit of the year omitted? Is the R a special character? If so, how can I escape it correctly?
How do I get rid of the last dot? I tried escaping it too, but that yielded no matches to the regex.
How exactly does * work as a placeholder?
>Solution :
We can use sub() here for a simple base R option:
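x <- "X1960..YR1960."
# Capture the four-digit year, require the same year after "YR" (\\1) plus the
# trailing dot, then rebuild the name as "Y" followed by the captured year.
x_out <- sub("^X(\\d{4})\\.\\.YR\\1\\.$", "Y\\1", x)
x_out
# [1] "Y1960"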
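To address the three questions directly: R is not a special character. In a regex, * is a quantifier meaning "zero or more of the preceding item", not a wildcard, so R* matches the single R. The unescaped . that follows matches any one character, which here is the first digit of the year, and that is why the 1 disappears. The trailing dot survives because the match ends before it is reached. Escaping that final . finds nothing because the character right after YR is a digit, not a literal dot. The sketch below reproduces the behaviour and applies the fix to all column names. It assumes a small made-up data frame standing in for the real df, and uses base sub() in place of str_replace() so that no package is needed.

```r
# Assumption: a toy data frame standing in for the real `df`.
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
names(df) <- c("Country", "X1960..YR1960.", "X2022..YR2022.")

# The pattern from the question: "R*" consumes the "R", and the unescaped "."
# then consumes the first digit of the second year. The rest is left over.
original <- sub("X\\d\\d\\d\\d\\.\\.YR*.", "Y", names(df))
original
# [1] "Country" "Y960."   "Y022."

# Fix: capture the year, match the repeated year and the literal final dot,
# and put the captured year back after "Y". Non-matching names are untouched.
names(df) <- sub("^X(\\d{4})\\.\\.YR\\1\\.$", "Y\\1", names(df))
names(df)
# [1] "Country" "Y1960"   "Y2022"
```

The ^ and $ anchors are a design choice rather than a requirement: they make sure only names that follow the full pattern are renamed.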