ggplot: stat_summary for mean with facet

Within ggplot2, I am using the stat_summary() function to calculate and plot the mean and standard deviation of a dataset. I am simultaneously using facet_wrap() to break the dataset into two plots.
I was pleasantly surprised that adding facet_wrap() to my ggplot caused stat_summary() to be applied to each subset of the data independently.

> df
| ID        | Group | Strain | Condition | DoublingTime    |
|-----------|-------|--------|-----------|-----------------|
| A_3g_Rep1 | A_3g  | A      | 3g        |     122.4135    |
| A_3g_Rep2 | A_3g  | A      | 3g        |     124.5801    |
| A_3g_Rep3 | A_3g  | A      | 3g        |     124.9419    |
| A_6g_Rep1 | A_6g  | A      | 6g        |     120.5004    |
| A_6g_Rep2 | A_6g  | A      | 6g        |     124.1666    |
| A_6g_Rep3 | A_6g  | A      | 6g        |     124.6453    |
| B_3g_Rep1 | B_3g  | B      | 3g        |     132.568     |
| B_3g_Rep2 | B_3g  | B      | 3g        |     137.5242    |
| B_3g_Rep3 | B_3g  | B      | 3g        |     135.5238    |
| B_6g_Rep1 | B_6g  | B      | 6g        |     137.1333    |
| B_6g_Rep2 | B_6g  | B      | 6g        |     142.733     |
| B_6g_Rep3 | B_6g  | B      | 6g        |     140.0722    |
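To make the plots below reproducible, the table can be rebuilt as a data frame. This is only a sketch: the values are copied from the table, while the use of rep() and paste() to generate the label columns is my own shorthand, not part of the original post.

```r
# Rebuild the example data; DoublingTime values are copied from the table.
df <- data.frame(
  Strain = rep(c("A", "B"), each = 6),
  Condition = rep(rep(c("3g", "6g"), each = 3), times = 2),
  DoublingTime = c(122.4135, 124.5801, 124.9419,
                   120.5004, 124.1666, 124.6453,
                   132.568, 137.5242, 135.5238,
                   137.1333, 142.733, 140.0722)
)

# Derive the label columns: Group is Strain_Condition, ID adds the replicate.
df$Group <- paste(df$Strain, df$Condition, sep = "_")
df$ID <- paste0(df$Group, "_Rep", rep(1:3, times = 4))

# Match the column order shown in the table.
df <- df[, c("ID", "Group", "Strain", "Condition", "DoublingTime")]
df
```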

First, I used the following, which correctly calculates the mean and standard deviation values. However, each facet's x-axis includes groups that aren't present in that facet.

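library(ggplot2)

# mean_sdl() wraps Hmisc::smean.sdl(), so the Hmisc package must be installed.
# mult = 1 draws error bars at mean +/- 1 standard deviation.
DT_plotA <- ggplot(df, aes(Group, DoublingTime)) +
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1),
               geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  facet_wrap(. ~ Strain, nrow = 1)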

DT_plotA

I was pleasantly surprised that changing the aes() x-value to Condition while keeping facet_wrap() caused stat_summary() to calculate the mean and standard deviation correctly for each Group.

DT_plotB <- ggplot(df, aes(Condition, DoublingTime)) +
  stat_summary(fun.data = "mean_sdl", fun.args = list(mult = 1),
               geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  facet_wrap(. ~ Strain, nrow = 1)

DT_plotB

However, if facet_wrap() is removed from the plot, stat_summary() calculates the mean and standard deviation by Condition alone, so data from independent Strains are pooled. I worry that this caveat will be forgotten and lead to incorrect mean/sd calculations when the facet is removed.

DT_plotC <- ggplot(df, aes(Condition, DoublingTime)) +
  stat_summary(fun.data="mean_sdl", fun.args = list(mult=1),
               geom="errorbar", width=0.5) +
  stat_summary(fun=mean, geom="point", size=3)

DT_plotC
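The difference between the two groupings can be checked numerically outside ggplot2. The sketch below uses base R's aggregate() to compare the pooled per-Condition means (what DT_plotC summarises) with the per-Strain-and-Condition means (what the faceted plot summarises). The data frame is rebuilt from the table above, and the names pooled and per_group are my own.

```r
# Rebuild the example data; DoublingTime values are copied from the table.
df <- data.frame(
  Strain = rep(c("A", "B"), each = 6),
  Condition = rep(rep(c("3g", "6g"), each = 3), times = 2),
  DoublingTime = c(122.4135, 124.5801, 124.9419,
                   120.5004, 124.1666, 124.6453,
                   132.568, 137.5242, 135.5238,
                   137.1333, 142.733, 140.0722)
)

# Without facets: one mean per Condition, mixing Strains A and B.
pooled <- aggregate(DoublingTime ~ Condition, data = df, FUN = mean)

# With facets on Strain: one mean per Strain/Condition combination.
per_group <- aggregate(DoublingTime ~ Strain + Condition, data = df, FUN = mean)

pooled
per_group
```

The pooled table has two rows and the per-group table has four. Each pooled mean lies between the corresponding Strain A and Strain B means, which is exactly the averaging across strains described above.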

Question
Is there a way to generate a plot that looks like DT_plotB but uses aes(Group, DoublingTime), as shown in the code for DT_plotA?

Solution

One option is to preprocess the data, calculating the mean and standard deviation for each Strain and Condition before plotting:

library(dplyr)
library(ggplot2)

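df %>%
  group_by(Strain, Condition) %>%
  # summarise() returns one row per Strain/Condition pair; mutate() would keep
  # all 12 rows and draw each point and error bar three times
  summarise(mean_dt = mean(DoublingTime),
            sd_dt = sd(DoublingTime),
            .groups = "drop") %>%
  ggplot(aes(x = Condition, y = mean_dt)) +
  geom_point() +
  geom_errorbar(aes(ymin = mean_dt - sd_dt, ymax = mean_dt + sd_dt), width = 0.2) +
  facet_wrap(. ~ Strain)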
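The answer above still maps Condition to the x-axis. If the goal is to keep aes(Group, DoublingTime) as in DT_plotA, one possibility is to let each facet have its own x scale with scales = "free_x", which drops the Group levels that have no data in a panel. The following is a sketch rather than the answerer's method: DT_plotD and the mean_sd() helper are hypothetical names, and mean_sd() stands in for mean_sdl with mult = 1 so that the example does not need the Hmisc package.

```r
library(ggplot2)

# Rebuild the example data; DoublingTime values are copied from the table.
df <- data.frame(
  Strain = rep(c("A", "B"), each = 6),
  Condition = rep(rep(c("3g", "6g"), each = 3), times = 2),
  DoublingTime = c(122.4135, 124.5801, 124.9419,
                   120.5004, 124.1666, 124.6453,
                   132.568, 137.5242, 135.5238,
                   137.1333, 142.733, 140.0722)
)
df$Group <- paste(df$Strain, df$Condition, sep = "_")

# Hypothetical helper: mean +/- 1 SD in the y/ymin/ymax shape that
# stat_summary(fun.data = ...) expects.
mean_sd <- function(x) {
  data.frame(y = mean(x), ymin = mean(x) - sd(x), ymax = mean(x) + sd(x))
}

DT_plotD <- ggplot(df, aes(Group, DoublingTime)) +
  stat_summary(fun.data = mean_sd, geom = "errorbar", width = 0.5) +
  stat_summary(fun = mean, geom = "point", size = 3) +
  # free_x gives each panel only the Group levels present in that panel
  facet_wrap(. ~ Strain, nrow = 1, scales = "free_x")

DT_plotD
```

Because Group already encodes both Strain and Condition, the summaries here stay correct even if facet_wrap() is later removed; the facets only control which groups share a panel.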
