# Calculate overall mean of multiple columns by group

I have several columns with reaction times (RT) per participant. This is a simplified version of the dataframe I have (in reality I have 20 RT columns):

``````df <- data.frame(id=c(1,1,1,5,5,5), RT_0=c(0.467,NA,0.634,0.416,NA,0.35), RT_1=c(0.3,0.35,0.301,0.267,0.3,0.449), RT_2=c(1.251,NA, 0.334,0.34, 0.317,0.42), RT_3=c(0.284,0.316,0.283,0.367,0.55,0.35),pos=c(1,6,2,4,2,6))

id  RT_0    RT_1    RT_2    RT_3    pos
1  0.467   0.3     1.251   0.284   1
1  NA      0.35    NA      0.316   6
1  0.634   0.301   0.334   0.283   2
5  0.416   0.267   0.34    0.367   4
5  NA      0.3     0.317   0.55    2
5  0.35    0.449   0.42    0.35    6
``````

I want to calculate the overall mean of the RT columns by id. For example, the mean of all reaction times of participant nº1 was 0.452.

Something like this:

``````id  RT_0    RT_1    RT_2    RT_3    pos  mean
1  0.467   0.3     1.251   0.284   1    0.452
1  NA      0.35    NA      0.316   6    0.452
1  0.634   0.301   0.334   0.283   2    0.452
5  0.416   0.267   0.34    0.367   4    0.375
5  NA      0.3     0.317   0.55    2    0.375
5  0.35    0.449   0.42    0.35    6    0.375
``````

### >Solution :

We may group by ‘id’, select the columns that `starts_with` ‘RT_’ in column names within `across` (or `pick`), `unlist` the columns to a `vector` and get the overall mean with `mean` and assign it as new column in `mutate`

``````library(dplyr)
df %>%
group_by(id) %>%
mutate(mean = mean(unlist(across(starts_with("RT_"))),
na.rm = TRUE)) %>%
ungroup
``````

-output

``````# A tibble: 6 × 7
id   RT_0  RT_1   RT_2  RT_3   pos  mean
<dbl>  <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
1     1  0.467 0.3    1.25  0.284     1 0.452
2     1 NA     0.35  NA     0.316     6 0.452
3     1  0.634 0.301  0.334 0.283     2 0.452
4     5  0.416 0.267  0.34  0.367     4 0.375
5     5 NA     0.3    0.317 0.55      2 0.375
6     5  0.35  0.449  0.42  0.35      6 0.375
``````

Or with `data.table`

``````library(data.table)
setDT(df)[, mean := mean(unlist(.SD), na.rm = TRUE),
.(id), .SDcols = patterns("^RT_")]
``````

-output

``````> df
id  RT_0  RT_1  RT_2  RT_3 pos      mean
1:  1 0.467 0.300 1.251 0.284   1 0.4520000
2:  1    NA 0.350    NA 0.316   6 0.4520000
3:  1 0.634 0.301 0.334 0.283   2 0.4520000
4:  5 0.416 0.267 0.340 0.367   4 0.3750909
5:  5    NA 0.300 0.317 0.550   2 0.3750909
6:  5 0.350 0.449 0.420 0.350   6 0.3750909
``````