How to format any column with 'date' in the title as a date

I have imported an excel spreadsheet into R and the data frame has numerous columns with date in the title. I can format a named column as aa date as follows:

df$date <- as.Date(as.numeric(df$date), origin = "1899-12-30")

How would I do this for all columns with ‘date’ in the title? Here’s an example data frame, though it doesn’t have anywhere near as many columns as the real thing. Ideally the answer would use dplyr.

df <- structure(list(source = c("Track", "Track", "Track", "Track", 
"Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
), collection_date = c("39646", "39654", "39643", "39644", "40389"
), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
), received_date = c("39651", "39660", "39685", "39685", "40421"
), storage_date = c("39653", "39744", "39685", "39685", "40421"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))

>Solution :

We can use across and contains to select all variables that contain the string "date".

library(tidyverse)
df <- structure(list(source = c("Track", "Track", "Track", "Track", 
                                "Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
                                ), collection_date = c("39646", "39654", "39643", "39644", "40389"
                                ), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
                                ), received_date = c("39651", "39660", "39685", "39685", "40421"
                                ), storage_date = c("39653", "39744", "39685", "39685", "40421"
                                )), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
                                ))

df <- df %>%
  mutate(across(contains("date"), ~as.Date(as.numeric(.x), origin = "1899-12-30")))

head(df)
#> # A tibble: 5 × 6
#>   source sample_type collection_date collection_date2 received_date storage_date
#>   <chr>  <chr>       <date>          <date>           <date>        <date>      
#> 1 Track  SQC         2008-07-17      2008-07-17       2008-07-22    2008-07-24  
#> 2 Track  DNA         2008-07-25      2008-07-25       2008-07-31    2008-10-23  
#> 3 Track  PBMC        2008-07-14      2008-07-14       2008-08-25    2008-08-25  
#> 4 Track  PBMC        2008-07-15      2008-07-15       2008-08-25    2008-08-25  
#> 5 Track  PBMC        2010-07-30      2010-07-30       2010-08-31    2010-08-31

Leave a Reply