Delete all remaining observations from data frame when exceeding a specific value

I want to delete some observations from my data frame. For each idividual (indicated by ID in the following), I have more than one obsersvation and therefore more than one row in the data frame.

The idea is to delete all remaining observations for one individual, when they exceed a specific (pre-defined) value of time.

Here is sone exmaple data to explain the problem:

ID <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5,5,6,6,6,7,7)
time <- c(20,23,11,14,15,31, 14, 21, 7, 11, 45, 21, 35, 2, 90, 36, 18, 3, 35, 10, 5,19, 27, 12, 51,91,4,13,9,12,10,7,8,15,6,16)

sim_dat <- data.frame(ID, time)

Now imagine, my specific time-limit would be 30. Then I would like to delete all remaining observations for one individual, when it first exceeded the limit (and important: It doesn’t matter, if the deleted values where less or greater than 30).

To be more precise: In the simulated data frame, ID 1 has 8 observations and exceeded the value 30 at his 6th observation. Now, the observations 6,7,8 of ID 1 must be deleted.

For ID 3 no observations should be in the final data frame, as ID 3 exceeded the value 30 within the 1st observation. On the other hand, ID 4 doesn’t exceed the value 30, therefore no observations should be deleted.

I hope the problem is clear and I would appreciate any kind of feedback 🙂

Thank you

>Solution :

Update: Richie’s comment-answer is brilliant:

library(dplyr)

sim_dat |> 
  mutate(flag = cumsum(time > 30), .by = ID) |> 
  filter(flag == 0) |> 
  select(-flag)

Output:

   ID time
1   1   20
2   1   23
3   1   11
4   1   14
5   1   15
6   2    7
7   2   11
8   4   10
9   4    5
10  4   19
11  4   27
12  4   12
13  6    7
14  6    8
15  6   15
16  7    6
17  7   16

Leave a Reply