Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Rename multiple columns by extracting contents in parentheses for a list of dataframes

Given data and code as follows, I’m able to generate a list of dataframes dfs:

library(data.table)
library(purrr)
library(glue)
library(gt)
library(tidyverse)
library(purrr)

df <- structure(list(id = c("M0000607", "M0000609", "M0000612"), `2021-08(actual)` = c(12.6, 
                                                                                       19.2, 8.3), `2021-09(actual)` = c(10.3, 17.3, 6.4), `2021-10(actual)` = c(8.9, 
                                                                                                                                                                 15.7, 5.3), `2021-11(actual)` = c(7.3, 14.8, 3.1), `2021-12(actual)` = c(6.1, 
                                                                                                                                                                                                                                          14.2, 3.5), `2021-08(pred)` = c(11.65443222, 14.31674997, 7.084180415
                                                                                                                                                                                                                                          ), `2021-09(pred)` = c(12.29810914, 17.7143733, 6.057927385), 
                     `2021-10(pred)` = c(9.619846116, 15.54553601, 6.525992602
                     ), `2021-11(pred)` = c(8.352097939, 13.97318204, 3.164682627
                     ), `2021-12(pred)` = c(6.113631596, 14.16243166, 3.288372517
                     ), `2021-08(error)` = c(2.082307066, 1.146759554, 0.687406723
                     ), `2021-09(error)` = c(1.631350383, 2.753457736, 2.952737781
                     ), `2021-10(error)` = c(0.945567783, 4.883250027, 1.215819585
                     ), `2021-11(error)` = c(1.998109138, 0.414373304, 0.342072615
                     ), `2021-12(error)` = c(0.719846116, 0.154463985, 1.225992602
                     )), class = "data.frame", row.names = c(NA, -3L))

year_months <- c('2021-12', '2021-11', '2021-10')  
curr <- lubridate::ym(year_months)
prev <- curr - months(2L)
dfs <- mapply(function(x, y) {
  df[c(
    "id", 
    format(seq.Date(y, x, by = "month"), "%Y-%m(actual)"), 
    format(x, "%Y-%m(pred)"), 
    format(x, "%Y-%m(error)")
  )]
}, curr, prev, SIMPLIFY = FALSE)

I want to rename last two column by extracting content in parentheses as new column names:

[[1]]
        id 2021-10(actual) 2021-11(actual) 2021-12(actual)          pred          error
1 M0000607             8.9             7.3             6.1      6.113632      0.7198461
2 M0000609            15.7            14.8            14.2     14.162432      0.1544640
3 M0000612             5.3             3.1             3.5      3.288373      1.2259926

[[2]]
        id 2021-09(actual) 2021-10(actual) 2021-11(actual)          pred          error
1 M0000607            10.3             8.9             7.3      8.352098      1.9981091
2 M0000609            17.3            15.7            14.8     13.973182      0.4143733
3 M0000612             6.4             5.3             3.1      3.164683      0.3420726

[[3]]
        id 2021-08(actual) 2021-09(actual) 2021-10(actual)          pred          error
1 M0000607            12.6            10.3             8.9      9.619846      0.9455678
2 M0000609            19.2            17.3            15.7     15.545536      4.8832500
3 M0000612             8.3             6.4             5.3      6.525993      1.2158196

How could I do that using R? Thanks.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Reference link:

Extract info inside all parenthesis in R

>Solution :

Using lapply to manipulate the elements of the list dfs, we can use some regex to do the job.

dfs <- lapply(dfs, function(x) {
  col_num <- grep('pred|error', colnames(x))
  col_names <- grep('pred|error', colnames(x), value = TRUE)
  colnames(x)[col_num] <- gsub('[0-9]+-[0-9]+\\(|\\)', '', col_names)
  x
})

Output

> dfs
[[1]]
        id 2021-10(actual) 2021-11(actual) 2021-12(actual)      pred     error
1 M0000607             8.9             7.3             6.1  6.113632 0.7198461
2 M0000609            15.7            14.8            14.2 14.162432 0.1544640
3 M0000612             5.3             3.1             3.5  3.288373 1.2259926

[[2]]
        id 2021-09(actual) 2021-10(actual) 2021-11(actual)      pred     error
1 M0000607            10.3             8.9             7.3  8.352098 1.9981091
2 M0000609            17.3            15.7            14.8 13.973182 0.4143733
3 M0000612             6.4             5.3             3.1  3.164683 0.3420726

[[3]]
        id 2021-08(actual) 2021-09(actual) 2021-10(actual)      pred     error
1 M0000607            12.6            10.3             8.9  9.619846 0.9455678
2 M0000609            19.2            17.3            15.7 15.545536 4.8832500
3 M0000612             8.3             6.4             5.3  6.525993 1.2158196
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading