Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Using cut.Date and Starting Weeks on Saturday in R

I currently am aggregating data in exactly 4-week intervals, but I need the weeks to start on Saturdays. I am using the cut.Date line below on below’s reproducible example to create a column indicating each 4-week window, but I am struggling to find a resource (in either dplyr or lubridate) to aggregate in exactly 4-week intervals beginning on Saturday.

Here is what I have:

library(tidyverse)

# Example

example <- structure(list(
  yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016
  ), 
  date = structure(
    c(17054, 17055, 17056, 17057, 17058, 17059, 
      17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068, 
      17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077, 
      17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086, 
      17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"), 
  day_of_week = structure(
    c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
      1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
      2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
      3L, 4L), 
    .Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", 
               "Sat"), class = c("ordered", "factor"))), 
  row.names = c(NA, -40L), 
  class = c("tbl_df", "tbl", "data.frame"))



# Using this in a mutate to generate the week
example %>% 
  mutate(wk = cut.Date(date, breaks = "4 week", labels = FALSE)) %>%
  print()

And I would like it to look like the following:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

# Ideal output
output <- structure(list(
  yr = c(2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 
         2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016), 
  date = structure(
    c(17054, 17055, 17056, 17057, 17058, 17059, 
      17060, 17061, 17062, 17063, 17064, 17065, 17066, 17067, 17068, 
      17069, 17070, 17071, 17072, 17073, 17074, 17075, 17076, 17077, 
      17078, 17079, 17080, 17081, 17082, 17083, 17084, 17085, 17086, 
      17087, 17088, 17089, 17090, 17091, 17092, 17093), class = "Date"), 
  day_of_week = structure(
    c(7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
      1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 
      2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
      3L, 4L),  
    .Label = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"), 
    class = c("ordered", "factor")), 
  wk = c(1L, 1L, 1L, 
         1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
         1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
         2L, 2L, 2L, 2L, 2L, 2L, 2L)), 
  row.names = c(NA, -40L),  
  class = c("tbl_df", "tbl", "data.frame"))

Again, the first 4-week period goes from 2016-09-10 to 2016-10-08 when using Saturdays, which I need. Hit me with any questions you may have, and thanks in advance!

>Solution :

We’ll use integer-division (%/%) to determine how many 28-day periods occurred between "today’s date" and the first "Fri" in the data.

Up front, your output both starts and ends on a Saturday, which doesn’t seem to cycle consistently. I’ll assume that you want to start each 4-week cycle on a Saturday, so it’ll end on a Friday.

base R

example$wk <- with(example, as.numeric(date - date[which.min(day_of_week == "Fri")]) %/% 28 + 1L)
print(example, n = 40)
# # A tibble: 40 x 4
#       yr date       day_of_week    wk
#    <dbl> <date>     <ord>       <dbl>
#  1  2016 2016-09-10 Sat             1
#  2  2016 2016-09-11 Sun             1
#  3  2016 2016-09-12 Mon             1
#  4  2016 2016-09-13 Tue             1
#  5  2016 2016-09-14 Wed             1
#  6  2016 2016-09-15 Thu             1
#  7  2016 2016-09-16 Fri             1
#  8  2016 2016-09-17 Sat             1
#  9  2016 2016-09-18 Sun             1
# 10  2016 2016-09-19 Mon             1
# 11  2016 2016-09-20 Tue             1
# 12  2016 2016-09-21 Wed             1
# 13  2016 2016-09-22 Thu             1
# 14  2016 2016-09-23 Fri             1
# 15  2016 2016-09-24 Sat             1
# 16  2016 2016-09-25 Sun             1
# 17  2016 2016-09-26 Mon             1
# 18  2016 2016-09-27 Tue             1
# 19  2016 2016-09-28 Wed             1
# 20  2016 2016-09-29 Thu             1
# 21  2016 2016-09-30 Fri             1
# 22  2016 2016-10-01 Sat             1
# 23  2016 2016-10-02 Sun             1
# 24  2016 2016-10-03 Mon             1
# 25  2016 2016-10-04 Tue             1
# 26  2016 2016-10-05 Wed             1
# 27  2016 2016-10-06 Thu             1
# 28  2016 2016-10-07 Fri             1
# 29  2016 2016-10-08 Sat             2
# 30  2016 2016-10-09 Sun             2
# 31  2016 2016-10-10 Mon             2
# 32  2016 2016-10-11 Tue             2
# 33  2016 2016-10-12 Wed             2
# 34  2016 2016-10-13 Thu             2
# 35  2016 2016-10-14 Fri             2
# 36  2016 2016-10-15 Sat             2
# 37  2016 2016-10-16 Sun             2
# 38  2016 2016-10-17 Mon             2
# 39  2016 2016-10-18 Tue             2
# 40  2016 2016-10-19 Wed             2

dplyr

library(dplyr)
example %>%
  mutate(
    wk = as.numeric(date - date[which.min(day_of_week == "Fri")]) %/% 28 + 1L
  ) %>%
  print(n=99)
# # A tibble: 40 x 4
#       yr date       day_of_week    wk
#    <dbl> <date>     <ord>       <dbl>
#  1  2016 2016-09-10 Sat             1
#  2  2016 2016-09-11 Sun             1
#  3  2016 2016-09-12 Mon             1
#  4  2016 2016-09-13 Tue             1
#  5  2016 2016-09-14 Wed             1
#  6  2016 2016-09-15 Thu             1
#  7  2016 2016-09-16 Fri             1
#  8  2016 2016-09-17 Sat             1
#  9  2016 2016-09-18 Sun             1
# 10  2016 2016-09-19 Mon             1
# 11  2016 2016-09-20 Tue             1
# 12  2016 2016-09-21 Wed             1
# 13  2016 2016-09-22 Thu             1
# 14  2016 2016-09-23 Fri             1
# 15  2016 2016-09-24 Sat             1
# 16  2016 2016-09-25 Sun             1
# 17  2016 2016-09-26 Mon             1
# 18  2016 2016-09-27 Tue             1
# 19  2016 2016-09-28 Wed             1
# 20  2016 2016-09-29 Thu             1
# 21  2016 2016-09-30 Fri             1
# 22  2016 2016-10-01 Sat             1
# 23  2016 2016-10-02 Sun             1
# 24  2016 2016-10-03 Mon             1
# 25  2016 2016-10-04 Tue             1
# 26  2016 2016-10-05 Wed             1
# 27  2016 2016-10-06 Thu             1
# 28  2016 2016-10-07 Fri             1
# 29  2016 2016-10-08 Sat             2
# 30  2016 2016-10-09 Sun             2
# 31  2016 2016-10-10 Mon             2
# 32  2016 2016-10-11 Tue             2
# 33  2016 2016-10-12 Wed             2
# 34  2016 2016-10-13 Thu             2
# 35  2016 2016-10-14 Fri             2
# 36  2016 2016-10-15 Sat             2
# 37  2016 2016-10-16 Sun             2
# 38  2016 2016-10-17 Mon             2
# 39  2016 2016-10-18 Tue             2
# 40  2016 2016-10-19 Wed             2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading