Get mean of every n rows for multiple columns R

August 8, 2023

I feel like this should be straightforward but I can’t find an existing answer to my question. I have a df:

df <- data.frame(ID = c('a', 'b', 'c', 'c1', 'd', 'e', 'f', 'g', 'h', 'h1'),
                 var2 = c(7, 9, 2, 4, 3, 6, 8, 2, 1, 2),
                 var3 = c(21, 50, 40, 30, 29, 45, 33, 51, 70, 46))

And I’d like to get the mean of every n rows for columns var2 and var3 separately, so that the output looks like this:

  var2 var3
1  8.0 35.5
2  3.0 35.0
3  4.5 37.0
4  5.0 42.0
5  1.5 58.0

It would be a bonus if I could keep the first ID of the two rows, e.g:

  ID var2 var3
1  a  8.0 35.5
2  c  3.0 35.0
3  d  4.5 37.0
4  f  5.0 42.0
5  h  1.5 58.0

Ty in advance

>Solution :

We need to add a grouping column, and then this is a standard grouped mean:

library(dplyr)
n = 2
df |>
  mutate(group = ((row_number() - 1) %/% n) + 1) |>
  summarize(
    first_id = first(ID),
    across(starts_with("var"), mean),
    .by = group
  )
#   group first_id var2 var3
# 1     1        a  8.0 35.5
# 2     2        c  3.0 35.0
# 3     3        d  4.5 37.0
# 4     4        f  5.0 42.0
# 5     5        h  1.5 58.0