How to use regex in str_replace (stringr) to rename variables in R

I’m using R 4.3.1

I have a data frame with several variables, including years. The years are formatted this way:
X1960..YR1960.
I would like to rename all the variables following this pattern to a simplified version:
Y1960

The dataframe contains variables from X1960..YR1960. to X2022..YR2022.

My current approach:

names(df) <- str_replace(names(df), "X\\d\\d\\d\\d\\.\\.YR*.", "Y")

Result of the current approach:
Y960.

I don’t understand the following things:
Why is the first digit of the year omitted? Is the R a special character? If so, how can I escape it correctly?
How do I get rid of the last dot? I tried escaping it too, but that yielded no matches for the regex.
How exactly does * work as a placeholder?

Solution:

We can use sub() here for a simple base R option:

x <- "X1960..YR1960."
x_out <- sub("X(\\d{4})\\.\\.YR\\1\\.", "Y\\1", x)
x_out
# [1] "Y1960"
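
To see why the attempt in the question produced Y960., it helps to run both patterns side by side. In a regex, R is an ordinary character, `*` is a quantifier meaning "zero or more of the preceding item" (so `R*` matches zero or more R's), and an unescaped `.` matches any single character. The trailing `.` in the original pattern therefore consumed the first digit of the second year. The sketch below uses base R's sub() on a hypothetical vector of names modeled on the question; `old_names` and the "Country" entry are assumptions for illustration only.

```r
# Hypothetical column names modeled on the pattern in the question
old_names <- c("Country", "X1960..YR1960.", "X2022..YR2022.")

# Pattern from the question: "R*" matches the single R, then the unescaped "."
# matches the next character, which is the first digit of the second year.
# Everything up to and including that digit is replaced by "Y".
broken <- sub("X\\d\\d\\d\\d\\.\\.YR*.", "Y", old_names)
print(broken)

# Capture the first year in a group, match the rest of the name explicitly
# (including the escaped final dot), and rebuild the name as "Y" + year.
# The ^ and $ anchors keep names that do not follow the pattern untouched.
fixed <- sub("^X(\\d{4})\\.\\.YR\\d{4}\\.$", "Y\\1", old_names)
print(fixed)
```

With a data frame, the same call can be applied as names(df) <- sub(pattern, "Y\\1", names(df)). stringr's str_replace() should accept the same pattern and replacement strings, with the arguments in the order (string, pattern, replacement).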