Grid of density plots with reference data plotted in each group

Let’s say I have a data frame:

df = data.frame(var = c("a", "a", "b", "b", "c", "c", "a", "a", "b", "b", "c", "c", "a", "a", "b",  "b", "c", "c"),
                source = c("ref", "ref", "ref", "ref", "ref", "ref", "source1", "source1", "source1", "source1", "source1", "source1", "source2", "source2", "source2", "source2", "source2", "source2"),
                value = c(2.5, 1, 3.5, 1.6, 2.2, 3.1, 2, 1.2, 1.8, 0.4, 1.4, 1.3, 3, 2.8, 4, 3.6, 2.9, 3.8))

> df
   var  source value
1    a     ref   2.5
2    a     ref   1.0
3    b     ref   3.5
4    b     ref   1.6
5    c     ref   2.2
6    c     ref   3.1
7    a source1   2.0
8    a source1   1.2
9    b source1   1.8
10   b source1   0.4
11   c source1   1.4
12   c source1   1.3
13   a source2   3.0
14   a source2   2.8
15   b source2   4.0
16   b source2   3.6
17   c source2   2.9
18   c source2   3.8

I would like to generate density plots of value for each var / source pair. This works:

library(tidyverse)  # attaches ggplot2 and the %>% pipe

df %>%
  ggplot(aes(x = value)) +
  # after_stat(density) replaces the deprecated ..density.. notation
  geom_density(aes(y = after_stat(density), fill = source), adjust = 1, alpha = 0.5) +
  facet_grid(source ~ var, scales = "fixed") +
  theme_bw()

producing a 3 x 3 grid of density plots, one row per source (ref, source1, source2) and one column per var (image not available).

What I really want, however, is only two rows, corresponding to source1 and source2, with an additional density curve in each plot based on the values from ref.

I tried to find a solution following this post, but I did not succeed.
In other words, I would like each plot in the grid to show the distribution of the ref values as a reference, and the ref group should not appear in the plot legend.

Any help is highly appreciated. Thank you.

Solution:

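One option is to split your data frame in two: one part containing the reference values and one containing everything else. For the reference data frame, also drop the source column. facet_grid() repeats a layer's data across every value of a faceting variable that the layer lacks, so the reference curve is then drawn in both the source1 and source2 rows. Then use two geom_density() layers, one per data frame.

Keeping the reference out of the legend is straightforward: do not map fill inside aes() for that layer, and instead set the desired fill color, if any, as a fixed parameter. In the code below I have simply set fill = NA.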

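library(ggplot2)

# Reference values only; without `source`, facet_grid() repeats them in every row
df1 <- df[df$source == "ref", c("var", "value")]
# Everything except the reference
df2 <- df[df$source != "ref", ]

ggplot(mapping = aes(x = value)) +
  # Reference curve: fill is a fixed parameter, so it adds no legend entry
  geom_density(data = df1, aes(y = after_stat(density)), fill = NA, adjust = 1, alpha = 0.5) +
  # Per-source curves: fill is mapped, so only source1 and source2 reach the legend
  geom_density(data = df2, aes(y = after_stat(density), fill = source), adjust = 1, alpha = 0.5) +
  facet_grid(source ~ var, scales = "fixed") +
  theme_bw()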
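
As a quick sanity check of this approach, the sketch below (not part of the original answer) rebuilds the example data and uses layer_data() to see which panels each layer is drawn in. The names ref, others, p, ref_panels and other_panels are illustrative only, and the check assumes a ggplot2 version that provides layer_data().

```r
library(ggplot2)

# Same example data as in the question, written more compactly.
df <- data.frame(
  var    = rep(rep(c("a", "b", "c"), each = 2), times = 3),
  source = rep(c("ref", "source1", "source2"), each = 6),
  value  = c(2.5, 1, 3.5, 1.6, 2.2, 3.1, 2, 1.2, 1.8, 0.4, 1.4, 1.3,
             3, 2.8, 4, 3.6, 2.9, 3.8)
)

# Reference data without the `source` column, and the remaining data with it.
ref    <- df[df$source == "ref", c("var", "value")]
others <- df[df$source != "ref", ]

p <- ggplot(mapping = aes(x = value)) +
  geom_density(data = ref, fill = NA) +
  geom_density(data = others, aes(fill = source), alpha = 0.5) +
  facet_grid(source ~ var) +
  theme_bw()

# PANEL identifies the facet each computed density belongs to.
ref_panels   <- sort(unique(as.integer(as.character(layer_data(p, 1)$PANEL))))
other_panels <- sort(unique(as.integer(as.character(layer_data(p, 2)$PANEL))))

print(ref_panels)
print(other_panels)
```

If the reference layer is repeated as described, both vectors should cover the same six panels (two source rows by three var columns), even though ref itself only holds one set of values per var.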
