How can you graph multiple overlapping 18-month periods with daily data?

I am doing exploratory data analysis on data that is collected daily over many years. The relevant time period is about 18 to 20 months from the same date each year. I would like to visually inspect these 18-month periods one on top of the other. I can do this as shown below, by passing the data for each period to its own geom_point() call, but I would like to avoid writing one call per period.

Minimal example:

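library(tidyverse)
library(lubridate)  # mdy(), month(), day(); loaded explicitly in case tidyverse does not attach it

# Daily data from 2010-01-01 through 2013-11-10
minex <- data.frame(dts = seq(mdy('01/01/2010'), mdy('11/10/2013'), by = 'days'))
minex$day <- as.numeric(minex$dts - min(minex$dts))  # days since the first observation
minex$MMDD <- paste0(month(minex$dts), "-", day(minex$dts))

# Trend plus a yearly cycle plus noise
minex$v1 <- 20 + minex$day^0.4 - cos(2*pi*minex$day/365) + rnorm(nrow(minex), 0, 0.3)

# One geom_point() call per period; the 2012 period is shifted by 365 days to overlap the 2013 one
ggplot(filter(minex, dts %in% seq(mdy('11/10/2013') - (365 + 180), mdy('11/10/2013'), by = 'days')),
       aes(day, v1)) +
  geom_point() +
  geom_point(data = filter(minex, dts %in% seq(mdy('11/10/2012') - (365 + 180), mdy('11/10/2012'), by = 'days')),
             aes(day + 365, v1), color = 'red')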

Solution:

Since your spans of time overlap, I think we can lapply() over the end dates, filter each span and re-base day so that it starts at 0 within the span, then use normal ggplot2 aesthetics to color the spans.

spans <- bind_rows(lapply(mdy("11/10/2010", "11/10/2011", "11/10/2012", "11/10/2013"), function(end) {
  filter(minex, between(dts, end - (365 + 180), end)) %>%
    mutate(day = day - min(day), end = end)
}))

ggplot(spans, aes(day, v1)) +
  geom_point(aes(color = factor(end)))
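
The same idea can be sketched with the window length and end dates as parameters, so the end dates do not have to be typed out. This is a minimal base R sketch, not part of the original answer: stack_spans() and its span_days argument are hypothetical names, the end dates are generated with seq(), and it assumes every window contains at least one row of data.

```r
# Hypothetical helper (not part of the original answer): stack fixed-length
# windows that end on each date in `ends`, re-basing `day` to start at 0.
stack_spans <- function(df, ends, span_days = 365 + 180) {
  pieces <- lapply(seq_along(ends), function(i) {
    end <- ends[i]  # indexing keeps the Date class
    piece <- df[df$dts >= end - span_days & df$dts <= end, , drop = FALSE]
    piece$day <- as.numeric(piece$dts - min(piece$dts))  # days since span start
    piece$end <- end                                     # label used for coloring
    piece
  })
  do.call(rbind, pieces)
}

# Same shape of data as in the question, built with base R only
minex <- data.frame(dts = seq(as.Date("2010-01-01"), as.Date("2013-11-10"), by = "day"))
minex$day <- as.numeric(minex$dts - min(minex$dts))
minex$v1 <- 20 + minex$day^0.4 - cos(2 * pi * minex$day / 365) + rnorm(nrow(minex), 0, 0.3)

# One end date per year, generated rather than typed out
ends <- seq(as.Date("2010-11-10"), as.Date("2013-11-10"), by = "year")
spans <- stack_spans(minex, ends)
```

The resulting spans data frame has the same day, v1, and end columns as the one built above, so it can be passed to the same ggplot() call.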

You can see the range of each with a quick summary:

spans %>%
  group_by(end) %>%
  summarize(startdate = min(dts), enddate = max(dts))
# # A tibble: 4 x 3
#   end        startdate  enddate   
#   <date>     <date>     <date>    
# 1 2010-11-10 2010-01-01 2010-11-10
# 2 2011-11-10 2010-05-14 2011-11-10
# 3 2012-11-10 2011-05-15 2012-11-10
# 4 2013-11-10 2012-05-14 2013-11-10
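
The summary shows that the first span starts on 2010-01-01, the first day of data, so it is shorter than the others. A small base R sketch, under the same 365 + 180 day window, counts the days available in each span and flags the truncated ones. The names n_days and complete are illustrative, not from the original answer.

```r
# Dates covered by the example data, and one end date per year
dts  <- seq(as.Date("2010-01-01"), as.Date("2013-11-10"), by = "day")
ends <- seq(as.Date("2010-11-10"), as.Date("2013-11-10"), by = "year")
span_days <- 365 + 180

# Number of daily observations that fall inside each window
n_days <- sapply(seq_along(ends), function(i) {
  sum(dts >= ends[i] - span_days & dts <= ends[i])
})

# A full window covers span_days + 1 dates (both endpoints included)
complete <- n_days == span_days + 1

data.frame(end = ends, n_days = n_days, complete = complete)
```

Whether to keep a truncated span in the plot is a judgment call; filtering on complete before stacking is one option if only full-length periods should be compared.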