Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I filter max value in a group and min value in another group?

I’m struggling with filtering options in R. I have this dataset:

patient_id    period     TREAT_CAT               Outcome
1            -3228 days pre-treatment                Pink
1            -3170 days pre-treatment                Pink
1              100 days post-treatment               Blue
1              200 days post-treatment               Pink
2            -2900 days pre-treatment                Blue
2                0 days post-treatment               Pink
2              100 days post-treatment               Pink

structure(list(patient_id = c(1, 1, 1, 1, 2, 2, 2), 
    period = structure(c(-3228, -3170, 100, 200, -2900, 
    0, 100), class = "difftime", units = "days"), 
    TREAT_CAT = structure(c( 1L, 1L, 2L, 2L, 1L, 2L, 2L), levels = c("pre-treatment", "post-treatment"), class = "factor"), 
    Outcome = c("Pink", "Pink", "Blue", "Pink", 
    "Blue", "Pink", "Pink"), row.names = c(19L, 26L, 24L, 3L, 7L, 29L, 20L), class = "data.frame")

I would like to filter the closest "period" to 0 for the pre-treatment group and the farest "period" from 0 on the post-treatment group.

I’ve tried something like this

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df2 <- df %>%
  group_by(patient_id) %>% 
  filter((TREAT_CAT=="pre-treatment" & period == min(period)) | (TREAT_CAT=="post-treatment" & period == max(period))) %>% 
  filter(n() == 2)

But obviously it gaves me the farest from 0 from both periods. I’ve also tried with max(period) for both groups but it’s not working because the max (period) it’s only happening for the post-treatment group, resulting in 0 variables.

I would expect something like

patient_id    period     TREAT_CAT               Outcome
1            -3170 days pre-treatment                Pink
1              200 days post-treatment               Pink
2            -2900 days pre-treatment                Blue
2              100 days post-treatment               Pink

Could you please help?

Thanks in advance

>Solution :

I assume that pre-treatment is always negative. That way, you can slice_max() on each group

df %>% 
  group_by(patient_id, TREAT_CAT) %>% 
  slice_max(period)

# A tibble: 4 × 4
# Groups:   patient_id, TREAT_CAT [4]
  patient_id period     TREAT_CAT      Outcome
       <dbl> <drtn>     <fct>          <chr>  
1          1 -3170 days pre-treatment  Pink   
2          1   200 days post-treatment Pink   
3          2 -2900 days pre-treatment  Blue   
4          2   100 days post-treatment Pink 
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading