I am working with a dataset with repeated observations within each treatment, and I need to find the mean value within each of the treatments. I would like to use dplyr. My data is as follows:
ex <- data.frame(plot = c(101,102,103,104,105,106,201,202,203,204,205,206,301,302,303,304,305,306),
trt = c("a","a","b","b","c","c","a","a","b","b","c","c","a","a","b","b","c","c"),
value = c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6))
The summarised dataset needs to be the mean of each treatment within the sub-groups. Here is what it should look like when completed:
correct <- data.frame(trt = c("a","b","c","a","b","c","a","b","c"),
rating = c(1.5,3.5,5.5,1.5,3.5,5.5,1.5,3.5,5.5))
Here is what I have tried:
library(dplyr)
example <- ex %>%
dplyr::select(plot, trt, value) %>%
group_by(trt) %>%
summarise(rating = mean(value), .groups = 'drop')
However, the following is produced:
incorrect_example <- data.frame(trt=c("a","b","c"),
rating=c(1.5,3.5,5.5))
How can I produce results like those indicated in the correct dataframe?
>Solution :
We may need rleid
library(dplyr)
library(data.table)
ex %>%
group_by(grp = rleid(trt), trt) %>%
summarise(rating = mean(value), .groups = 'drop') %>%
select(-grp)
-output
# A tibble: 9 × 2
trt rating
<chr> <dbl>
1 a 1.5
2 b 3.5
3 c 5.5
4 a 1.5
5 b 3.5
6 c 5.5
7 a 1.5
8 b 3.5
9 c 5.5
Or may also use %/% on the ‘plot’ to create a grouping column along with ‘trt’ as group, then get the mean of the ‘value’ column
ex %>%
group_by(grp = plot %/% 100, trt) %>%
summarise(value = mean(value), .groups = 'drop') %>%
select(-grp)
-output
# A tibble: 9 × 2
trt value
<chr> <dbl>
1 a 1.5
2 b 3.5
3 c 5.5
4 a 1.5
5 b 3.5
6 c 5.5
7 a 1.5
8 b 3.5
9 c 5.5