How to take the mean of two subsequent rows iteratively thereby reducing the number of rows?

July 12, 2022

I have a tibble like so:

library(dplyr)
set.seed(1)
my_tib <- tibble(identifier = rep(letters[1:3], each = 4),
       year = rep(seq(2005, 2020, 5), 3),
       value = rnorm(12, mean = 1000, 100) %>% round()
       )

my_tib

# A tibble: 12 × 3
   identifier  year value
   <chr>      <dbl> <dbl>
 1 a           2005   937
 2 a           2010  1018
 3 a           2015   916
 4 a           2020  1160
 5 b           2005  1033
 6 b           2010   918
 7 b           2015  1049
 8 b           2020  1074
 9 c           2005  1058
10 c           2010   969
11 c           2015  1151
12 c           2020  1039

Now I’d like to shrink my tibble by averaging value over each pair of consecutive years within an identifier, and to add a new column for the year bracket. For example, the mean of 937 and 1018 (977.5) should go into the new year_bracket 2005-2010.

I’d like to repeat this for all years and all identifiers.
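The arithmetic for a single identifier can be checked by hand. A minimal base R sketch, assuming only the four values of identifier a copied from the printed tibble (the names pair_mean and year_bracket are just illustrative):

```r
# Values and years for identifier "a", copied from the printed tibble
value <- c(937, 1018, 916, 1160)
year  <- c(2005, 2010, 2015, 2020)

# Average each value with its successor:
# head(x, -1) drops the last element, tail(x, -1) drops the first
pair_mean <- (head(value, -1) + tail(value, -1)) / 2

# Label each pair with its start and end year
year_bracket <- paste0(head(year, -1), "-", tail(year, -1))

data.frame(year_bracket, value = pair_mean)
```

Four yearly rows collapse into three bracket rows, which is why the full result has 9 rows instead of 12.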

So the first 5 rows of my new tibble would look like this:

head(my_new_tib, 5)

# A tibble: 5 × 3
  identifier year_bracket value
  <chr>      <chr>        <dbl>
1 a          2005-2010     977.5
2 a          2010-2015     967
3 a          2015-2020    1038
4 b          2005-2010     975.5
5 b          2010-2015     983.5

Ideally, I’m looking for a piped dplyr solution but I’m also curious regarding other solutions.

Solution:

Using dplyr:

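library(dplyr)
my_tib |> 
  group_by(identifier) |> 
  # average each value with the previous row's value within the identifier;
  # the first row of each group has no predecessor, so it becomes NA
  mutate(value = (value + lag(value))/2,
         year_bracket = paste0(lag(year)," - ",year),
         .keep = "unused",   # drops year (used as input but not re-created)
         .before = 2) |>     # places year_bracket as the second column
  filter(!is.na(value)) |>   # remove the NA first row of each group
  ungroup()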

Output:

# A tibble: 9 x 3
  identifier year_bracket value
  <chr>      <chr>        <dbl>
1 a          2005 - 2010   978.
2 a          2010 - 2015   967 
3 a          2015 - 2020  1038 
4 b          2005 - 2010   976.
5 b          2010 - 2015   984.
6 b          2015 - 2020  1062.
7 c          2005 - 2010  1014.
8 c          2010 - 2015  1060 
9 c          2015 - 2020  1095
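
The pipeline above depends on the rows already being sorted by year within each identifier, because lag() simply looks at the previous row. A hedged variant, assuming the same columns as my_tib, sorts explicitly and uses the unspaced 2005-2010 label from the question. The data are rebuilt here from the printed values so the sketch runs on its own, and the native pipe needs R 4.1 or later:

```r
library(dplyr)

# Example data, copied from the printed tibble in the question
my_tib <- tibble(
  identifier = rep(letters[1:3], each = 4),
  year       = rep(seq(2005, 2020, 5), 3),
  value      = c(937, 1018, 916, 1160,
                 1033, 918, 1049, 1074,
                 1058, 969, 1151, 1039)
)

my_new_tib <- my_tib |>
  arrange(identifier, year) |>   # make the row order explicit before using lag()
  group_by(identifier) |>
  mutate(
    year_bracket = paste0(lag(year), "-", year),  # e.g. "2005-2010"
    value        = (lag(value) + value) / 2       # NA on each group's first row
  ) |>
  ungroup() |>
  filter(!is.na(value)) |>       # drop the first row of every identifier
  select(identifier, year_bracket, value)

head(my_new_tib, 5)
```

Selecting the columns at the end, rather than using .keep and .before, is only a readability choice; both versions should give the same nine rows apart from the label format.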