Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Take differences between dates within id over a sequence (different rows) in R

I would like – for each id (cpf) – to take the differences, in months, between hire_date and sep_date on the previous row. For example, I would like to take the difference between hire_date associated with Order 1 and sep_date associated with order 2 (and the same for ids with more than two Order values).

Not all observations have just two order values. Some may have significantly more. How can I write a code that accounts for that? In this case there are more than two rows for a id. So I would also need to make more than one difference.

I would always like to take a difference between a given hire_date (e.g. order 2) and the previous sep_Date (order 1) and so forth. For more than two rows: hire_date (order 3) – sep_date (order 2); hire_date (order 2) – sep_date (order 1)…

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

structure(list(cpf = c(234L, 234L, 245L, 245L, 245L, 555L, 555L
), hire_date = c("10-11-29", "13-7-29", "11-10-19", "13-3-20", 
"13-5-20", "10-02-18", "13-11-21"), sep_date = c("13-4-18", "13-8-29", 
"13-2-15", "13-4-20", NA, "13-10-20", NA), Order = c(1L, 2L, 
1L, 2L, 3L, 1L, 2L)), class = "data.frame", row.names = c(NA, 
-7L))

  cpf hire_date sep_date Order
1 234  10-11-29  13-4-18     1
2 234   13-7-29  13-8-29     2
3 245  11-10-19  13-2-15     1
4 245   13-3-20  13-4-20     2
5 245   13-5-20     <NA>     3
6 555  10-02-18 13-10-20     1
7 555  13-11-21     <NA>     2

Any help would be appreciated!

>Solution :

We can convert the date columns to Date class and do a group by difftime

library(dplyr)
library(lubridate)
df1 %>%
  mutate(across(hire_date:sep_date, dmy)) %>% 
  group_by(cpf) %>% 
  mutate(Month = as.numeric(difftime(hire_date, 
     lag(sep_date), unit = "weeks"))/4) %>%
  ungroup
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading