
R ggplot2: overlaid histograms and density curves for multiple levels of a fill variable

I want to plot three overlapping histograms (each in a different, semi-transparent color) with their density curves overlaid on the same graph, without ggplot2 automatically stacking the bins that belong to different factor levels. The density for each level should be calculated independently, but everything should sit in a single panel so the distributions can be compared directly (hence no facet_wrap!). I can get this to work by adding a separate geom_histogram() call for each level, but with many factor levels this becomes a pain. I also tried grouping the data by the factor, but the resulting graph is still stacked. The reason I want this is that I need to examine many individuals and their distributions and identify unusual individuals or subgroups with a different pattern, so I really do want all of the histograms overlaid in one plot.

library(ggplot2)
library(dplyr)  # provides %>% and filter(), used in the second attempt below

# Simulate some data
color = c("yellow", "brown", "green")
n_per_factor = c(45, 50, 48)
flies = c(rgamma(n_per_factor[1], shape = 40, scale = 3.5),
          rgamma(n_per_factor[2], shape = 40, scale = 3),
          rgamma(n_per_factor[3], shape = 40, scale = 2.5))
bananas = data.frame(color = rep.int(color, times = n_per_factor),
                     flies = flies)

# Attempt 1: histogram bars come out stacked (not what I want)
ggplot(data = bananas, aes(x = flies, after_stat(density), 
                           fill = color))+
  geom_histogram(alpha = 0.50)+
  geom_density(alpha = 0.50)+
  xlab("Flies per Banana")+
  ylab("Density")

# Attempt 2: not stacked (what I want), but needs one geom_histogram() call per level
ggplot(data = bananas, aes(x = flies, after_stat(density), 
                           fill = color))+
  geom_histogram(data = bananas %>% filter(color == "yellow"),
                 aes(x = flies, after_stat(density), 
                     fill = color), alpha = 0.50)+
  geom_histogram(data = bananas %>% filter(color == "green"),
                 aes(x = flies, after_stat(density), 
                     fill = color), alpha = 0.50)+
  geom_histogram(data = bananas %>% filter(color == "brown"),
                 aes(x = flies, after_stat(density), 
                     fill = color), alpha = 0.50)+
  geom_density(alpha = 0.50)+
  xlab("Flies per Banana")+
  ylab("Density")

# Attempt 3: adding group = color, but the bars are still stacked
ggplot(data = bananas, aes(x = flies, after_stat(density), 
                           group = color, fill = color))+
  geom_histogram(alpha = 0.50)+
  geom_density(alpha = 0.50)+
  xlab("Flies per Banana")+
  ylab("Density")

[Image: Unhappy_stacked_histo]

[Image: Happy_stacked_histo]


Solution:

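geom_histogram() defaults to position = "stack", which is why the bars for the different colors pile on top of each other even when group = color is set. Setting position = "identity" in geom_histogram() draws every group's bars from zero, so the histograms are overlaid instead of stacked. geom_density() already defaults to position = "identity", so specifying it there is optional but harmless.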

library(ggplot2)

ggplot(data = bananas, aes(x = flies, fill = color, group = color)) +
  geom_histogram(aes(y = after_stat(density)), alpha = 0.5, position = "identity", bins = 30) +
  geom_density(alpha = 0.5, position = "identity") +
  scale_fill_manual(values = c(brown = "brown", green = "green", yellow = "yellow")) +  # named values, so the mapping does not depend on level order
  # scale_fill_identity(guide = "legend") +  # alternative to scale_fill_manual()
  xlab("Flies per Banana") +
  ylab("Density")
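
If you want to convince yourself that the overlaid histograms are really normalised per group, one way is to inspect the data ggplot2 computes for the histogram layer. The following is only a minimal sketch: the set.seed() call and the object names p, bars and area_by_group are my own additions, not part of the question. It rebuilds the example data, pulls the computed bars with layer_data(), and checks that each group's bars start at zero and that each group's area sums to 1.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Same simulated data as in the question
bananas <- data.frame(
  color = rep(c("yellow", "brown", "green"), times = c(45, 50, 48)),
  flies = c(rgamma(45, shape = 40, scale = 3.5),
            rgamma(50, shape = 40, scale = 3),
            rgamma(48, shape = 40, scale = 2.5))
)

# Histogram layer only, overlaid rather than stacked
p <- ggplot(bananas, aes(x = flies, fill = color)) +
  geom_histogram(aes(y = after_stat(density)), position = "identity",
                 alpha = 0.5, bins = 30)

# Computed data for layer 1 (one row per bar per group)
bars <- layer_data(p, 1)

# With position = "identity" every bar starts at the x axis;
# with the default position = "stack" some ymin values would be above zero
all(bars$ymin == 0)

# Area under each group's histogram (density * bin width); each should be ~1
area_by_group <- tapply(bars$density * (bars$xmax - bars$xmin), bars$group, sum)
area_by_group
```

Because the fill aesthetic already splits the data into groups, the same call should work unchanged with more factor levels, though with many individuals the overlapping bars can become hard to read.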
