How to replace column names based on partial string match in R?

I have an integer variable, period, that I use to create dummy columns as follows:

library(fastDummies)

df <- data.frame(period = c(-3, -2, -1, 0, 1, 2, 3))

df <- dummy_cols(df, select_columns = "period")

 period period_-1 period_-2 period_-3 period_0 period_1 period_2 period_3
     -3         0         0         1        0        0        0        0
     -2         0         1         0        0        0        0        0
     -1         1         0         0        0        0        0        0
      0         0         0         0        1        0        0        0
      1         0         0         0        0        1        0        0
      2         0         0         0        0        0        1        0
      3         0         0         0        0        0        0        1

I would like to rename the dummy columns for negative and positive values of period to "lag" and "lead", respectively. The ideal output (with period_0 manually renamed to event):

 period      lag1      lag2      lag3    event    lead1    lead2    lead3
     -3         0         0         1        0        0        0        0
     -2         0         1         0        0        0        0        0
     -1         1         0         0        0        0        0        0
      0         0         0         0        1        0        0        0
      1         0         0         0        0        1        0        0
      2         0         0         0        0        0        1        0
      3         0         0         0        0        0        0        1

Any help is appreciated, thanks!

Solution:

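Two chained sub() calls will work, as long as the negative prefix "period_-" is replaced first; otherwise "period_" would also match the negative columns and produce names like "lead-1". The _ placeholder for the native pipe requires R 4.2 or later: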

colnames(df) <- sub("period_-", "lag", colnames(df)) |>
  sub("period_", "lead", x = _)
colnames(df)[colnames(df) == "lead0"] <- "event"
df
#   period lag1 lag2 lag3 event lead1 lead2 lead3
# 1     -3    0    0    1     0     0     0     0
# 2     -2    0    1    0     0     0     0     0
# 3     -1    1    0    0     0     0     0     0
# 4      0    0    0    0     1     0     0     0
# 5      1    0    0    0     0     1     0     0
# 6      2    0    0    0     0     0     1     0
# 7      3    0    0    0     0     0     0     1

With dplyr:

library(dplyr)
df %>%
  rename_with(.fn = ~
    if_else(.x == "period_0", "event",
            sub("period_", "lead", sub("period_-", "lag", .x))))
#   period lag1 lag2 lag3 event lead1 lead2 lead3
# 1     -3    0    0    1     0     0     0     0
# 2     -2    0    1    0     0     0     0     0
# 3     -1    1    0    0     0     0     0     0
# 4      0    0    0    0     1     0     0     0
# 5      1    0    0    0     0     1     0     0
# 6      2    0    0    0     0     0     1     0
# 7      3    0    0    0     0     0     0     1
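
If the zero case should not need a separate step, another option is to parse the signed number after "period_" and build each name from its sign. The following is only a sketch in base R: rename_period is a hypothetical helper name, not part of the answer above or of any package, and it assumes the dummy columns are named exactly "period_" followed by an optional minus sign and digits.

```r
# Hypothetical helper (an illustration, not part of the original answer):
# rename period dummies by parsing the signed integer after "period_".
rename_period <- function(nms) {
  # Only touch names of the form period_<integer>; "period" itself is left alone
  is_dummy <- grepl("^period_-?[0-9]+$", nms)
  k <- as.integer(sub("^period_", "", nms[is_dummy]))
  nms[is_dummy] <- ifelse(
    k == 0, "event",
    paste0(ifelse(k < 0, "lag", "lead"), abs(k))
  )
  nms
}

nms <- c("period", "period_-1", "period_-2", "period_-3",
         "period_0", "period_1", "period_2", "period_3")
new_nms <- rename_period(nms)
new_nms
# [1] "period" "lag1"   "lag2"   "lag3"   "event"  "lead1"  "lead2"  "lead3"
```

It could then be applied with colnames(df) <- rename_period(colnames(df)), or passed to rename_with() in place of the anonymous function. Because it works from the sign of the number, the order of the replacements no longer matters.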

Starting data

df <- structure(list(period = -3:3, "period_-1" = c(0L, 0L, 1L, 0L, 0L, 0L, 0L), "period_-2" = c(0L, 1L, 0L, 0L, 0L, 0L, 0L), "period_-3" = c(1L, 0L, 0L, 0L, 0L, 0L, 0L), period_0 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L), period_1 = c(0L, 0L, 0L, 0L, 1L, 0L, 0L), period_2 = c(0L, 0L, 0L, 0L, 0L, 1L, 0L), period_3 = c(0L, 0L, 0L, 0L, 0L, 0L, 1L)), class = "data.frame", row.names = c(NA, -7L))