Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create variables in a dataframe as there are days between two dates?

In a data.frame, I would like to create as many variables as there are days between two dates.
Here’s my example below.
Many thanks in advance !

mydf <- data.frame(
  ident = c("a","b","c"),
  start = c("2023-01-01","2023-01-06","2023-01-24"),
  end = c("2023-01-03","2023-01-12","2023-01-30")
)

ident      start        end
1     a 2023-01-01 2023-01-03
2     b 2023-01-06 2023-01-12
3     c 2023-01-24 2023-01-30
# Expected output
mydf <- data.frame(
  ident = c("a","b","c"),
  start = c("2023-01-01","2023-01-03","2023-01-24"),
  end = c("2023-01-03","2023-01-12","2023-01-30"),
  `Sunday 1 January` = c(1,0,0),
  `Monday 2 January` = c(1,0,0),
  `Tuesday 3 January` = c(1,1,0),
  `Wednesday 4 January` = c(0,1,0)
)
  ident      start        end Sunday.1.January Monday.2.January Tuesday.3.January Wednesday.4.January
1     a 2023-01-01 2023-01-03                1                1                 1                   0
2     b 2023-01-03 2023-01-12                0                0                 1                   1
3     c 2023-01-24 2023-01-30                0                0                 0                   0

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Creating columns where the column name is effectively "data" is generally a bad idea for several reasons; it’s generally better to work with the data in a long format and then pivot into something with "date-like column names" just for rendering. But if you really need it, then

library(dplyr)
library(tidyr) # unnest, pivot_wider
mydf %>%
  mutate(
    z = 1L,                              # uses as the "value" later
    across(c(start,end), as.Date),       # because we need seq.Date
    dates = Map(seq.Date, start, end, list(by = "day"))
  ) %>%
  unnest(dates) %>%
  pivot_wider(
    c(ident, start, end), names_from = "dates",
    values_from = "z", values_fill = 0L)
# # A tibble: 3 x 20
#   ident start      end        `2023-01-01` `2023-01-02` `2023-01-03` `2023-01-06` `2023-01-07` `2023-01-08` `2023-01-09` `2023-01-10` `2023-01-11` `2023-01-12` `2023-01-24` `2023-01-25` `2023-01-26` `2023-01-27` `2023-01-28`
#   <chr> <date>     <date>            <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>        <int>
# 1 a     2023-01-01 2023-01-03            1            1            1            0            0            0            0            0            0            0            0            0            0            0            0
# 2 b     2023-01-06 2023-01-12            0            0            0            1            1            1            1            1            1            1            0            0            0            0            0
# 3 c     2023-01-24 2023-01-30            0            0            0            0            0            0            0            0            0            0            1            1            1            1            1
# # ... with 2 more variables: 2023-01-29 <int>, 2023-01-30 <int>
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading