Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Get mean of every n rows for multiple columns R

I feel like this should be straightforward but I can’t find an existing answer to my question. I have a df:

df <- data.frame(ID = c('a', 'b', 'c', 'c1', 'd', 'e', 'f', 'g', 'h', 'h1'),
                 var2 = c(7, 9, 2, 4, 3, 6, 8, 2, 1, 2),
                 var3 = c(21, 50, 40, 30, 29, 45, 33, 51, 70, 46))

And I’d like to get the mean of every n rows for columns var2 and var3 separately, so that the output looks like this:

  var2 var3
1  8.0 35.5
2  3.0 35.0
3  4.5 37.0
4  5.0 42.0
5  1.5 58.0

It would be a bonus if I could keep the first ID of the two rows, e.g:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  ID var2 var3
1  a  8.0 35.5
2  c  3.0 35.0
3  d  4.5 37.0
4  f  5.0 42.0
5  h  1.5 58.0

Ty in advance

>Solution :

We need to add a grouping column, and then this is a standard grouped mean:

library(dplyr)
n = 2
df |>
  mutate(group = ((row_number() - 1) %/% n) + 1) |>
  summarize(
    first_id = first(ID),
    across(starts_with("var"), mean),
    .by = group
  )
#   group first_id var2 var3
# 1     1        a  8.0 35.5
# 2     2        c  3.0 35.0
# 3     3        d  4.5 37.0
# 4     4        f  5.0 42.0
# 5     5        h  1.5 58.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading