I have imported an excel spreadsheet into R and the data frame has numerous columns with date in the title. I can format a named column as aa date as follows:
df$date <- as.Date(as.numeric(df$date), origin = "1899-12-30")
How would I do this for all columns with ‘date’ in the title? Here’s an example data frame, though it doesn’t have anywhere near as many columns as the real thing. Ideally the answer would use dplyr.
df <- structure(list(source = c("Track", "Track", "Track", "Track",
"Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
), collection_date = c("39646", "39654", "39643", "39644", "40389"
), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
), received_date = c("39651", "39660", "39685", "39685", "40421"
), storage_date = c("39653", "39744", "39685", "39685", "40421"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
>Solution :
We can use across
and contains
to select all variables that contain the string "date".
library(tidyverse)
df <- structure(list(source = c("Track", "Track", "Track", "Track",
"Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
), collection_date = c("39646", "39654", "39643", "39644", "40389"
), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
), received_date = c("39651", "39660", "39685", "39685", "40421"
), storage_date = c("39653", "39744", "39685", "39685", "40421"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
df <- df %>%
mutate(across(contains("date"), ~as.Date(as.numeric(.x), origin = "1899-12-30")))
head(df)
#> # A tibble: 5 × 6
#> source sample_type collection_date collection_date2 received_date storage_date
#> <chr> <chr> <date> <date> <date> <date>
#> 1 Track SQC 2008-07-17 2008-07-17 2008-07-22 2008-07-24
#> 2 Track DNA 2008-07-25 2008-07-25 2008-07-31 2008-10-23
#> 3 Track PBMC 2008-07-14 2008-07-14 2008-08-25 2008-08-25
#> 4 Track PBMC 2008-07-15 2008-07-15 2008-08-25 2008-08-25
#> 5 Track PBMC 2010-07-30 2010-07-30 2010-08-31 2010-08-31