Add additional group in legend in ggplot2

I was trying to create a regression plot that shows the regression line for two subgroups and also the entire dataframe.

While doing that, I wondered whether it is possible to add a group to the legend that doesn't exist in the dataframe (my variable only has two distinct groups, but I want three entries in the legend).

In my case, I want a legend entry for the regression line fitted to both groups combined, but I am also interested in the general case.
Sample code is below.

Any help is much appreciated!

#Load packages
library(MASS)
library(ggplot2)
library(dplyr)
#Set a seed
set.seed(1234)

#Create random dataframe
sigma1 <- rbind(c(1, 0.8), c(0.8, 1))
mu <- c(4.5, 3.2)
dta1 <- as.data.frame(
  mvrnorm(n = 1000, mu = mu, Sigma = sigma1)) |>
  mutate(
    group = factor(1)  # every row in this block belongs to group 1
)

sigma2 <- rbind(c(1, -0.5), c(-0.5, 1))
dta2 <- as.data.frame(
  mvrnorm(n = 1000, mu = mu, Sigma = sigma2)) |>
  mutate(
    # factor(2) rather than sample(c(2), ...): sample() treats a single
    # number n as 1:n, which would mix groups 1 and 2 in this block
    group = factor(2)
)

dta <- rbind(dta1, dta2)

#Create the graphic
ggplot(dta, aes(x = V1, y = V2)) +
  geom_point(aes(color = group)) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_smooth(method = "lm", se = FALSE, aes(color = group)) +
  scale_color_manual(name = "Legend", values = c("green", "orange"), labels = c("A", "B"))

Solution:

Try this:

dta$group <- factor(dta$group,levels = c('1','2','3'))

ggplot(dta, aes(x = V1, y = V2)) +
  geom_point(aes(color = group)) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_smooth(method = "lm", se = FALSE, aes(color = group)) +
  scale_color_manual(name = "Legend", 
                     values = c("green", "orange","blue"), 
                     labels = c("A", "B","Overall"),
                     drop = FALSE)

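The strategy is to create a "dummy" unused factor level and then label it manually the way you want. The overall regression line comes from the first geom_smooth(), which has no color mapping and is drawn in the default blue, so the third legend entry is colored blue to match. Note that you need drop = FALSE in the scale; otherwise the unused factor level is omitted from the legend.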
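To make the mechanism easier to see, here is a minimal self-contained sketch of the same dummy-level approach on a small made-up data set. The names toy, x, y and p are placeholders, not part of the original question. Two things in it are my own assumptions rather than part of the answer above: the overall line gets an explicit color = "blue" so it matches the legend entry exactly, and show.legend = TRUE is set on the grouped smoother because, as far as I know, newer ggplot2 releases (3.5.0 and later) only draw a key symbol for levels that actually occur in a layer's data.

```r
library(ggplot2)

set.seed(1234)

# Small made-up data set with two real groups (illustrative names only)
toy <- data.frame(
  x = rnorm(200),
  group = factor(rep(c("1", "2"), each = 100))
)
# Positive slope in group 1, negative slope in group 2
toy$y <- ifelse(toy$group == "1", 0.8, -0.5) * toy$x + rnorm(200, sd = 0.5)

# Add an unused third level; no row ever takes the value "3"
toy$group <- factor(toy$group, levels = c("1", "2", "3"))
table(toy$group)  # per-level counts; level "3" has no rows

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(aes(color = group)) +
  # Overall fit: fixed color chosen to match the legend entry for level "3"
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "blue") +
  # Per-group fits; show.legend = TRUE is an assumption for newer ggplot2
  # versions, so that the unused level still gets a line drawn in its key
  geom_smooth(aes(color = group), method = "lm", formula = y ~ x,
              se = FALSE, show.legend = TRUE) +
  scale_color_manual(
    name = "Legend",
    values = c("1" = "green", "2" = "orange", "3" = "blue"),
    labels = c("A", "B", "Overall"),
    drop = FALSE  # keep the unused level "3" in the legend
  )
p
```

Naming the entries of values by factor level keeps each color tied to its level even if the level order changes later. If the "Overall" key still shows up empty on your ggplot2 version, the show.legend argument on the layers is the first thing I would check.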
