R: Delete Rows After First "Break" Occurs

February 3, 2023

I am working with the R programming language.

I have the following dataset:

library(dplyr)

my_data <- data.frame(
  id   = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  year = c(2010, 2011, 2012, 2013, 2015, 2016, 2015, 2016, 2020),
  var  = c(1, 7, 3, 9, 5, 6, 88, 12, 5)
)

> my_data
  id year var
1  1 2010   1
2  1 2011   7
3  1 2012   3
4  1 2013   9
5  1 2015   5
6  1 2016   6
7  2 2015  88
8  2 2016  12
9  2 2020   5

My question: for each ID, I want to find where the first "non-consecutive" year occurs and then delete all rows after that point.

For example:

When ID = 1, the first "jump" occurs at 2013 (i.e. there is no 2014). Therefore, I would like to delete all rows after 2013.
When ID = 2, the first "jump" occurs at 2016 – therefore, I would like to delete all rows after 2016.

This was my attempt to write the code for this problem:

final = my_data %>%
  group_by(id) %>%
  mutate(break_index = which(diff(year) > 1)[1]) %>%
  group_by(id, add = TRUE) %>%
  slice(1:break_index)

The code appears to work, but I get the following warning messages, which concern me:

Warning messages:
1: In 1:break_index :
  numerical expression has 6 elements: only the first used
2: In 1:break_index :
  numerical expression has 3 elements: only the first used

Can someone please tell me if I have done this correctly?

Thanks!

Solution:

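You get the warning because break_index is a column, so within each group it holds one value per row (6 values for id 1, 3 for id 2), and 1:break_index uses only the first of them. Since mutate() repeats the same value on every row of a group, using the first one gives the right answer, so your attempt works. To avoid the warning, pick a single value of break_index explicitly: use slice(1:break_index[1]) or slice(1:first(break_index)).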
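As a minimal sketch of that fix, here is the original attempt with only the slice() call changed. It assumes, as in the example data, that every id has at least one gap; the redundant second group_by() is left out and an ungroup() is added at the end, which is a choice made here rather than part of the original code.

```r
library(dplyr)

my_data <- data.frame(
  id   = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  year = c(2010, 2011, 2012, 2013, 2015, 2016, 2015, 2016, 2020),
  var  = c(1, 7, 3, 9, 5, 6, 88, 12, 5)
)

final <- my_data %>%
  group_by(id) %>%
  # position of the last row before the first gap, repeated on every row of the group
  mutate(break_index = which(diff(year) > 1)[1]) %>%
  # first() collapses the repeated column to a single number, so no warning is raised
  slice(1:first(break_index)) %>%
  ungroup()

final
```

The result keeps the helper column break_index; drop it with select(-break_index) if it is not needed.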

Here is another way to handle this.

library(dplyr)

my_data %>%
  group_by(id) %>%
  filter(row_number() <= which(diff(year) > 1)[1])

#     id  year   var
#  <dbl> <dbl> <dbl>
#1     1  2010     1
#2     1  2011     7
#3     1  2012     3
#4     1  2013     9
#5     2  2015    88
#6     2  2016    12
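To see why the condition row_number() <= which(diff(year) > 1)[1] keeps the right rows, it may help to walk through the pieces in base R for a single group. This is only an illustrative sketch using the years of id 1; the variable names gaps, break_index and keep are introduced here for the walk-through.

```r
# Years for id 1 from the example data
year <- c(2010, 2011, 2012, 2013, 2015, 2016)

# Difference between each year and the one before it
gaps <- diff(year)                  # 1 1 1 2 1

# Position of the first difference larger than 1;
# this is also the position of the last row before the gap
break_index <- which(gaps > 1)[1]   # 4

# Keep every row up to and including that position
keep <- seq_along(year) <= break_index
year[keep]                          # 2010 2011 2012 2013
```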

With dplyr 1.1.0 or later, we can use per-operation grouping with the .by argument:

my_data %>%
  filter(row_number() <= which(diff(year) > 1)[1], .by = id)
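One edge case worth noting: if an id has no gap at all, which(diff(year) > 1)[1] is NA, the comparison is NA on every row, and filter() drops that whole group. If such groups should be kept in full, one possible variant (a different technique from the answer above, shown only as a sketch) is to keep rows while the running count of gaps is still zero. The id 3 rows below are hypothetical and added purely to illustrate a group with no gap.

```r
library(dplyr)

my_data <- data.frame(
  # id 3 is a hypothetical extra group with consecutive years only
  id   = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  year = c(2010, 2011, 2012, 2013, 2015, 2016, 2015, 2016, 2020, 2001, 2002),
  var  = c(1, 7, 3, 9, 5, 6, 88, 12, 5, 4, 2)
)

result <- my_data %>%
  group_by(id) %>%
  # c(0, diff(year)) aligns each gap with the row that follows it;
  # cumsum() counts gaps seen so far, so rows before the first gap have count 0
  filter(cumsum(c(0, diff(year)) > 1) == 0) %>%
  ungroup()

result
```

With this version, ids 1 and 2 are cut at the first gap as before, while id 3 keeps both of its rows.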

by MR

Published February 03, 2023
