I have a column DuchenneMarker with the values 0 and 1.
I want to find runs of at least 10 consecutive 1s and mark them in a new column, DuchenneSmiles, like this:
DuchenneMarker DuchenneSmiles
0 0
0 0
0 0
1 0
1 0
0 0
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
1 1
I’ve already tried stats::filter() and dplyr::filter(), but neither worked.
Is there a function I should use, or do I need a brute-force method?
Solution:
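With the tidyverse you could do something like this. It builds a temporary grouping variable, run, that increments every time DuchenneMarker changes value, so each run of 0s or 1s gets its own group. A row is flagged when its group has at least 10 rows (n() > 9) and consists of 1s, and run is dropped at the end.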
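library(dplyr)

df %>%
  # start a new run id whenever the value differs from the previous row
  group_by(run = cumsum(DuchenneMarker != lag(DuchenneMarker, default = 0))) %>%
  # flag rows that belong to a run of 1s with at least 10 rows
  mutate(DuchenneSmiles = 0L + (n() > 9 & DuchenneMarker == 1)) %>%
  ungroup() %>%
  select(-run)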
# A tibble: 17 × 2
   DuchenneMarker DuchenneSmiles
            <int>          <int>
 1              0              0
 2              0              0
 3              0              0
 4              1              0
 5              1              0
 6              0              0
 7              1              1
 8              1              1
 9              1              1
10              1              1
11              1              1
12              1              1
13              1              1
14              1              1
15              1              1
16              1              1
17              1              1
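If you would rather avoid the extra dependency, the same run-length idea can probably be sketched in base R with rle(). This is an alternative to the pipeline above, not part of it, and the data frame below is an assumption: it is rebuilt by hand from the 17 example rows in the question.

```r
# Example data reconstructed from the question (assumed, 17 rows)
df <- data.frame(
  DuchenneMarker = c(0L, 0L, 0L, 1L, 1L, 0L, rep(1L, 11))
)

# rle() returns the length and the value of each run of identical values
r <- rle(df$DuchenneMarker)

# A run qualifies when it is made of 1s and is at least 10 rows long
keep <- r$values == 1 & r$lengths >= 10

# Expand the per-run flag back to one value per row
df$DuchenneSmiles <- rep(as.integer(keep), times = r$lengths)
```

Printing df afterwards should show the same DuchenneSmiles column as the tibble output. Changing the threshold only means editing the 10 in the comparison.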