I need to find the mean of the value x for each combination of two categorical columns, group and type:
set.seed(42) ## for sake of reproducibility
n <- 12
dat <- data.frame(group = rep(LETTERS[1:2], n/2),
                  type = rep(1:3, times = 4),
                  x = rnorm(n))
   group type           x
1      A    1  1.37095845
2      B    2 -0.56469817
3      A    3  0.36312841
4      B    1  0.63286260
5      A    2  0.40426832
6      B    3 -0.10612452
7      A    1  1.51152200
8      B    2 -0.09465904
9      A    3  2.01842371
10     B    1 -0.06271410
11     A    2  1.30486965
12     B    3  2.28664539
and I'd like output like this:
(For clarity, 'and so on' means the same mean calculation as in the first two rows; I did not work the remaining ones out by hand.)
  group type        mean
1     A    1 1.441240225
2     A    2 0.854568985
3     A    3   and so on
4     B    1   and so on
5     B    2   and so on
6     B    3   and so on
I have looked into dplyr tools that will help me get the mean of x by type OR by group, but not by both simultaneously.
Solution:
library(dplyr)
dat %>%
  group_by(group, type) %>%
  summarize(x = mean(x), .groups = "drop")
Or in base R with aggregate():
aggregate(x ~ group + type, dat, mean)
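Output of the dplyr version (aggregate() returns the same means as a plain data.frame, with rows ordered by type, then group):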
# A tibble: 6 x 3
  group  type      x
  <chr> <int>  <dbl>
1 A         1  1.44
2 A         2  0.855
3 A         3  1.19
4 B         1  0.285
5 B         2 -0.330
6 B         3  1.09
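The question asked for a result column called mean and rows ordered by group, then type. As a minimal base R sketch of how that layout could be produced (the variable name res is only illustrative, and dat is rebuilt so the block runs on its own):

```r
# Rebuild the example data from the question so this block is self-contained.
set.seed(42)
n <- 12
dat <- data.frame(group = rep(LETTERS[1:2], n / 2),
                  type  = rep(1:3, times = 4),
                  x     = rnorm(n))

# Formula interface: value on the left, grouping columns on the right.
res <- aggregate(x ~ group + type, data = dat, FUN = mean)

# Rename the value column to "mean" and sort by group, then type,
# to match the layout requested in the question.
names(res)[names(res) == "x"] <- "mean"
res <- res[order(res$group, res$type), ]
rownames(res) <- NULL

print(res)
```

In the dplyr version the same column name can be obtained by writing summarize(mean = mean(x), .groups = "drop") instead of summarize(x = mean(x), .groups = "drop").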