R: loop over a data frame and stop when a specific condition is met

I have a dataframe df, which looks like this:

id status
601 2
601 2
601 2
601 4
601 2
601 2
601 4
601 2
990 2
990 4

The first output I want:
For each id, I want to go through the rows (for example with a loop) and stop as soon as status 4 occurs for the first time, keeping that row and dropping everything after it.

So at the end it should look like this:

id status
601 2
601 2
601 2
601 4
990 2
990 4

The second output I want:
For each id, it should stop at the last 4, no matter how often 4 occurs in the original dataset. Nothing should come after that last 4.

id status
601 2
601 2
601 2
601 4
601 2
601 2
601 4
990 2
990 4

I do not know how to do this. Maybe there is also a way to do it with filtering?
I would really appreciate your help.

Solution:

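To get the rows up to and including the first 4 for each id, you can do:

library(dplyr)
df %>% 
  group_by(id) %>% 
  # cumany() turns TRUE at the first 4 and stays TRUE; lag() shifts it by one
  # row, so the first 4 itself is kept and only the rows after it are dropped
  filter(!lag(cumany(status == 4), default = FALSE))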

#     id status
#  <int>  <int>
#1   601      2
#2   601      2
#3   601      2
#4   601      4
#5   990      2
#6   990      4
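
Since the question asks about a loop, here is a rough base R sketch of the same idea, without dplyr. It rebuilds the example data from the question and walks through each id, stopping at the first 4. The names keep and first_four are only illustrative, and the sketch assumes the rows are already in the order you care about.

```r
# Example data from the question
df <- data.frame(
  id     = c(601, 601, 601, 601, 601, 601, 601, 601, 990, 990),
  status = c(2, 2, 2, 4, 2, 2, 4, 2, 2, 4)
)

# keep[i] becomes TRUE for every row that should stay
keep <- logical(nrow(df))

for (current_id in unique(df$id)) {
  rows <- which(df$id == current_id)   # row positions of this id, in order
  for (i in rows) {
    keep[i] <- TRUE                    # keep the row, including the first 4
    if (df$status[i] == 4) break       # stop scanning this id after its first 4
  }
}

first_four <- df[keep, ]
print(first_four)
```

Like the filter() version, an id that never reaches 4 keeps all of its rows. For larger data the grouped filter() is usually the more convenient choice; the loop is mainly useful for seeing the logic step by step.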

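And to get everything up to and including the last 4 for each id, you can do:

df %>% 
  group_by(id) %>% 
  # tmp = number of 4s seen before the current row
  mutate(tmp = lag(cumsum(status == 4), default = 0L)) %>% 
  # keep a row while fewer 4s precede it than its id contains in total
  # (this also works when an id ends on a repeated 4, e.g. 2, 4, 2, 4);
  # tmp == 0 additionally keeps ids that contain no 4 at all
  filter(tmp < sum(status == 4) | tmp == 0) %>% 
  select(-tmp)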

#      id status
# 1   601      2
# 2   601      2
# 3   601      2
# 4   601      4
# 5   601      2
# 6   601      2
# 7   601      4
# 8   990      2
# 9   990      4
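
If you would rather avoid dplyr here, one possible base R sketch is to mark every row that still has a 4 at or after it within the same id. The helper has_four_ahead and the names keep and until_last_four are made up for this illustration. Note one difference in behaviour: an id that contains no 4 at all is dropped entirely by this version, while the dplyr version above keeps it.

```r
# Example data from the question
df <- data.frame(
  id     = c(601, 601, 601, 601, 601, 601, 601, 601, 990, 990),
  status = c(2, 2, 2, 4, 2, 2, 4, 2, 2, 4)
)

# For a logical vector, return TRUE at every position that has a TRUE at or
# after it: reverse, take a running count of TRUEs, then reverse back
has_four_ahead <- function(is_four) rev(cumsum(rev(is_four)) > 0)

# ave() applies the helper separately within each id
keep <- ave(df$status == 4, df$id, FUN = has_four_ahead)

until_last_four <- df[keep, ]
print(until_last_four)
```

On the example data this drops only the final row of id 601, the 2 that follows its last 4.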
