Create new data.frame consisting of average value of every two rows

January 6, 2023

I have data frame from an experiment similar to the following and would like to reduce the number of rows by half by collapsing every two rows of the data according to the "Attribute" value. Ultimately, for each participant, I want to create a new value for each of the five Rating columns that represents the average of each rating for each Attribute.

For example, for Participant_ID = 1, rows 1 and 2 would be collapsed because they both have the Attribute X, rows 3 and 4 would be collapsed because they both have the Attribute Y, rows 5 and 6 would be collapsed because they both have the Attribute Z, and so on for the other Participants.

# Example data
Participant_ID <- c(rep(1, 6), rep(2, 6), rep(3, 6))
Gender <- c(rep("M", 6), rep("F", 6), rep("M", 6))
Group <- c(rep("A", 6), rep("B", 6), rep("C", 6))
Attribute <- rep(c(rep("X", 2), rep("Y", 2), rep("Z", 2)), 3)
Rating_1 <- sample(1:5, 18, replace = TRUE)
Rating_2 <- sample(1:5, 18, replace = TRUE)
Rating_3 <- sample(1:5, 18, replace = TRUE)
Rating_4 <- sample(1:5, 18, replace = TRUE)
Rating_5 <- sample(1:5, 18, replace = TRUE)

dat <- data.frame(factor(Participant_ID), factor(Gender), factor(Group), 
factor(Attribute), Rating_1, Rating_2, Rating_3, Rating_4, Rating_5)

Can someone recommend a method for doing this?

>Solution :

We group by the factor columns, and create a run-length-id on the ‘Attribute.’ column so that its index increase at a different Attribute and then summarise across the integer columns to return the mean

library(dplyr)
library(data.table)
dat %>%
   group_by(across(where(is.factor)), grp = rleid(factor.Attribute.)) %>% 
  summarise(across(where(is.integer), mean), .groups = 'drop')

-output

# A tibble: 9 × 10
  factor.Participant_ID. factor.Gender. factor.Group. factor.Attribute.   grp Rating_1 Rating_2 Rating_3 Rating_4 Rating_5
  <fct>                  <fct>          <fct>         <fct>             <int>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1 1                      M              A             X                     1      5        4        1.5      2.5      1  
2 1                      M              A             Y                     2      3.5      1.5      2        4.5      3  
3 1                      M              A             Z                     3      2        4        2.5      2        2  
4 2                      F              B             X                     4      2        2.5      1.5      2.5      2.5
5 2                      F              B             Y                     5      1.5      5        4        2        3.5
6 2                      F              B             Z                     6      2        1.5      2        1        2.5
7 3                      M              C             X                     7      3        2        3.5      1.5      3  
8 3                      M              C             Y                     8      3        4        3.5      1.5      3.5
9 3                      M              C             Z                     9      4.5      4.5      4.5      2.5      5