Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R keeping first/last dates within two groups

I have a dataset similar to the below:

id  date        event
1   01/02/2016  paracetamol
1   02/03/2016  paracetamol
1   16/04/2015  ibuprofen
2   01/03/2018  paracetamol
2   10/03/2018  codeine
2   15/02/2019  codeine
2   22/02/2019  codeine

And I want to:

  1. create two columns containing the first and last instances of each event and for each person
  2. remove duplicate events within each person

Resulting in the below:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

id  event        first_event  last_event
1   paracetamol  01/02/2016   02/03/2016
1   ibuprofen    16/04/2015   16/04/2015
2   paracetamol  01/03/2018   01/03/2018
2   codeine      10/03/2018   22/02/2019

Any suggestions on how to achieve this? I’ve been playing around functions like group_by and arrange, but can’t seem to get the combination right.

>Solution :

We may first convert the ‘date’ to Date class and get the min/max of ‘date’ after grouping by the columns

library(dplyr)
library(lubridate)
df1 %>%
    mutate(date = dmy(date)) %>%
    group_by(id, event) %>%
    summarise(first_event = min(date), last_event = max(date), .groups = 'drop')

-output

# A tibble: 4 × 4
     id event       first_event last_event
  <int> <chr>       <date>      <date>    
1     1 ibuprofen   2015-04-16  2015-04-16
2     1 paracetamol 2016-02-01  2016-03-02
3     2 codeine     2018-03-10  2019-02-22
4     2 paracetamol 2018-03-01  2018-03-01

data

df1 <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L, 2L), date = c("01/02/2016", 
"02/03/2016", "16/04/2015", "01/03/2018", "10/03/2018", "15/02/2019", 
"22/02/2019"), event = c("paracetamol", "paracetamol", "ibuprofen", 
"paracetamol", "codeine", "codeine", "codeine")),
 class = "data.frame", row.names = c(NA, 
-7L))
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading