R: getting the second occurrence of a sequence

df1 <- tribble(
    ~SUBJID, ~TP.DATE, ~TPR_ar,
    '2617001', '2019-04-11', 'Undefined',
    '2617001', '2019-07-09', 'PD',       
    '2617001', '2019-09-07', 'PD',       
    '2617001', '2019-10-19', 'PD',      
    '2617001', '2019-11-12', 'PD',      
    '2617001', '2020-01-13', 'PR',      
    '2617001', '2020-02-24', 'PD',
    '2617001', '2020-03-24', 'PD',
)

Hi, Stack Overflow!
I would like to get a specific date from the data above.
In that data, the sequence of TPR_ar values goes: 'Undefined', 'PD', 'PR', 'PD'.
What I would like to get is the first date of the second run of 'PD' (2020-02-24).
Thanks in advance!

>Solution :

We could use rle or rleid to group adjacent identical elements, then keep the first row of each 'PD' run and take the second of those rows:

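library(dplyr)
library(data.table)  # provides rleid()
df1 %>%
   group_by(grp = rleid(TPR_ar)) %>%              # one group per run of identical values
   filter(TPR_ar == 'PD', row_number() == 1) %>%  # first row of each 'PD' run
   ungroup() %>%
   slice(2) %>%                                   # the second 'PD' run
   pull(TP.DATE)
# [1] "2020-02-24"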
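
The answer mentions rle as an alternative but only shows rleid. As a rough sketch of the rle route, assuming the question's data is rebuilt as a plain data frame (the names runs, run_start, pd_start and second_pd_date are made up for this illustration), it could look like this in base R:

```r
# The question's data as a plain data frame (base R only, no packages).
df1 <- data.frame(
  SUBJID = "2617001",
  TP.DATE = c("2019-04-11", "2019-07-09", "2019-09-07", "2019-10-19",
              "2019-11-12", "2020-01-13", "2020-02-24", "2020-03-24"),
  TPR_ar = c("Undefined", "PD", "PD", "PD", "PD", "PR", "PD", "PD"),
  stringsAsFactors = FALSE
)

# rle() collapses consecutive repeats into runs (values plus lengths).
runs <- rle(df1$TPR_ar)
# runs$values:  "Undefined" "PD" "PR" "PD"
# runs$lengths: 1 4 1 2

# Row index at which each run starts: 1 2 6 7
run_start <- cumsum(c(1, head(runs$lengths, -1)))

# Keep the start rows of the 'PD' runs and take the second one.
pd_start <- run_start[runs$values == "PD"]
second_pd_date <- df1$TP.DATE[pd_start[2]]
second_pd_date
# [1] "2020-02-24"
```

If there are fewer than two 'PD' runs, pd_start[2] is NA and the result is NA rather than an error.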

If the data contains several subjects, group by "SUBJID" as well to get the date for each subject:

df1 %>%
   group_by(SUBJID, grp = rleid(TPR_ar)) %>%
   filter(TPR_ar == 'PD', row_number() == 1) %>%
   group_by(SUBJID) %>%
   slice(2) %>%
   pull(TP.DATE)
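
To see the per-subject version on more than one subject, here is a small sketch. It assumes dplyr 1.1.0 or later, where consecutive_id() can replace data.table's rleid(). The second subject ('9999999') and its rows are invented for illustration and are not part of the question's data; df2 and second_pd are likewise hypothetical names. It keeps SUBJID next to the date instead of pulling a bare vector, so each date can be matched to its subject.

```r
library(dplyr)  # consecutive_id() needs dplyr >= 1.1.0

# The question's subject plus a made-up second subject ('9999999').
df2 <- tibble::tribble(
  ~SUBJID,   ~TP.DATE,     ~TPR_ar,
  '2617001', '2019-04-11', 'Undefined',
  '2617001', '2019-07-09', 'PD',
  '2617001', '2019-09-07', 'PD',
  '2617001', '2019-10-19', 'PD',
  '2617001', '2019-11-12', 'PD',
  '2617001', '2020-01-13', 'PR',
  '2617001', '2020-02-24', 'PD',
  '2617001', '2020-03-24', 'PD',
  '9999999', '2019-05-01', 'PD',   # hypothetical rows
  '9999999', '2019-06-01', 'PR',
  '9999999', '2019-07-01', 'PD',
  '9999999', '2019-08-01', 'PD'
)

second_pd <- df2 %>%
  group_by(SUBJID) %>%
  mutate(grp = consecutive_id(TPR_ar)) %>%        # run id within each subject
  group_by(SUBJID, grp) %>%
  filter(TPR_ar == 'PD', row_number() == 1) %>%   # first row of each 'PD' run
  group_by(SUBJID) %>%
  slice(2) %>%                                    # second 'PD' run per subject
  ungroup() %>%
  select(SUBJID, TP.DATE)

second_pd
```

Subjects with fewer than two 'PD' runs simply drop out of the result, because slice(2) returns no row for them.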
