
ggplot2 histogram with ratios and facet_wrap

I am having issues with ggplot2 geom_histogram when plotting frequencies and using facet_wrap at the same time:

library(ggplot2)
myTestDF<- data.frame(
  Sample = as.vector(replicate(n = 6, expr = c('s1', 's2', 's3'))),
  var2 = c(
    replicate(n = 9, expr = 't1'),
    replicate(n = 9, expr = 't2')),
  Val1 = c(
    replicate(n = 3, expr = c(2,20,40)),
    replicate(n = 3, expr = c(0.2,0.4,0.6))),
  stringsAsFactors = FALSE)
myTestDF<- rbind(myTestDF, data.frame(Sample = 's2', var2 = 't1', Val1 = 70,
                                      stringsAsFactors = FALSE)) ##afterthought :)
myTestDF$var3<- paste(myTestDF$Sample, myTestDF$var2, sep = '_')

###Now, this works:

ggplot(
  data = myTestDF[myTestDF$Sample=='s2',],
  aes(x = Val1, fill = var2)) +
  geom_histogram(
    aes(y = after_stat(c(
      count[group==1]/sum(count[group==1]),
      count[group==2]/sum(count[group==2])))),
    position = "identity", alpha = 0.5) +
  labs(title = 'testHist',
       x = "Val1", y = "Frequency") +
  theme_minimal() 


However, I can't figure out a way to make it work with facet_wrap by 'Sample'. The frequencies get all messed up, and after experimenting and reading around, I still can't find a way to do it. Of course I could use a for loop, but I would like to understand whether it can be done with facet_wrap or another ggplot2 function. Looking forward to your feedback.


Solution:

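The problem is that your expression computes the relative frequencies per group without accounting for the panels. With facet_wrap, the data passed to after_stat() is ordered first by PANEL and then by group, so concatenating one vector per group both pools the counts across panels and returns the values in an order that no longer matches the rows. Instead, I would suggest using ave(), which computes the relative frequencies within each group and returns them in the original row order. In the code below I also added PANEL as a second grouping variable, so the frequencies sum to 1 for each fill group within each facet.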

library(ggplot2)

ggplot(
  myTestDF, aes(x = Val1, fill = var2)
) +
  geom_histogram(
    aes(y = after_stat(
      ave(count, group, PANEL, FUN = \(x) x / sum(x))
    )),
    position = "identity", alpha = 0.5
  ) +
  labs(
    title = "testHist",
    x = "Val1", y = "Frequency"
  ) +
  theme_minimal() +
  facet_wrap(~Sample)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
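
To see why ave() fixes the alignment, here is a minimal base-R sketch that does not need ggplot2. The `binned` data frame is a made-up stand-in for the binned data that after_stat() sees (its values and the names `binned`, `freq` and `by_group_only` are assumptions for illustration, not actual stat_bin() output); it only mimics the PANEL-then-group row order described above.

```r
# Hypothetical stand-in for the binned data: one row per bin,
# ordered by PANEL first and by group second.
binned <- data.frame(
  PANEL = c(1, 1, 1, 1, 2, 2, 2, 2),
  group = c(1, 1, 2, 2, 1, 1, 2, 2),
  count = c(1, 3, 2, 2, 4, 1, 5, 5)
)

# ave() normalises within every group/PANEL combination and returns
# the results in the original row order, so they line up with the bins.
binned$freq <- ave(binned$count, binned$group, binned$PANEL,
                   FUN = function(x) x / sum(x))

# The approach from the question: one vector per group, concatenated.
# It pools both panels and puts all group-1 rows before all group-2 rows.
by_group_only <- with(binned, c(
  count[group == 1] / sum(count[group == 1]),
  count[group == 2] / sum(count[group == 2])
))

print(cbind(binned, by_group_only))
```

In this toy data the `freq` column sums to 1 within each PANEL/group pair, whereas `by_group_only` divides by totals taken over both panels and its values no longer correspond to the rows they sit next to.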
