How to add a column to a dataframe based on the value of two other columns in R?

November 10, 2021

I need to find the mean of x for each combination of two categorical columns, group and type:


set.seed(42)  ## for the sake of reproducibility
n <- 12
dat <- data.frame(group = rep(LETTERS[1:2], n / 2),
                  type  = rep(1:3, times = 4),
                  x     = rnorm(n))


   group type           x
1      A    1  1.37095845
2      B    2 -0.56469817
3      A    3  0.36312841
4      B    1  0.63286260
5      A    2  0.40426832
6      B    3 -0.10612452
7      A    1  1.51152200
8      B    2 -0.09465904
9      A    3  2.01842371
10     B    1 -0.06271410
11     A    2  1.30486965
12     B    3  2.28664539

and I'd like output like the following ("and so on" stands for the same mean calculation as in the first two rows, which I did not work out by hand):

  group type        mean
1     A    1 1.441240225
2     A    2 0.854568985
3     A    3   and so on
4     B    1   and so on
5     B    2   and so on
6     B    3   and so on

I have looked into dplyr tools that will help me get the mean of x by type OR by group, but not by both simultaneously.

Solution:

library(dplyr)

dat %>% 
  group_by(group, type) %>% 
  summarize(x = mean(x), .groups = "drop")

Or in base R with aggregate():

aggregate(x ~ group + type, dat, mean)

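Output (from the dplyr version; aggregate() returns the same means as a plain data.frame, with rows ordered by type and then by group)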

# A tibble: 6 x 3
  group  type      x
  <chr> <int>  <dbl>
1 A         1  1.44 
2 A         2  0.855
3 A         3  1.19 
4 B         1  0.285
5 B         2 -0.330
6 B         3  1.09
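
Both approaches above collapse the data to one row per group/type pair. If the goal is closer to the title, i.e. keeping all 12 rows and adding the group-wise mean as a new column, a minimal sketch with base R's ave() could look like this. The column name x_mean is just a placeholder chosen for this example, and the dplyr variant in the comment assumes dplyr is installed.

```r
# Rebuild the example data from the question.
set.seed(42)
n <- 12
dat <- data.frame(group = rep(LETTERS[1:2], n / 2),
                  type  = rep(1:3, times = 4),
                  x     = rnorm(n))

# ave() returns a vector as long as x, where each element is the mean of x
# over the rows sharing that row's (group, type) combination.
# `x_mean` is a hypothetical column name used only for this sketch.
dat$x_mean <- ave(dat$x, dat$group, dat$type, FUN = mean)

# Roughly equivalent dplyr version (assumes dplyr is installed):
# dat <- dplyr::ungroup(
#   dplyr::mutate(dplyr::group_by(dat, group, type), x_mean = mean(x))
# )

print(dat)
```

Rows that share a group and type (for example rows 1 and 7, both A/1) end up with the same value in x_mean, so the original data stay intact and the summary is available alongside them.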