So, this is the data that I have:
x <- seq(-5, 5, length=1000)
y1 <- dnorm(x, 0, 0.2)
y2 <- dnorm(x, 0, 1)
data1 <- data.frame(x, y1, y2)
I want to plot the distributions of two variables as an example, and I generate the graph like so:
library(dplyr)    # for %>%
library(ggplot2)
p <- data1 %>% ggplot()
p + geom_line(aes(x, y1), linetype = 2) + geom_line(aes(x, y2)) + geom_line(aes(x, y= 0)) +
geom_text(aes(x = 0, y = -0.05, label = "Party A"), nudge_x = 0.5) +
geom_text(aes(x = 0, y = -0.05, label = "Party B"), nudge_x = - 0.5) +
xlab("Ideological scale: Left to Right") +
ylab("Probability")
I am trying to add a legend showing that the dashed line is the distribution of Party A and the solid line is the distribution of Party B. How can I add it?
>Solution :
ggplot2 really prefers "long data". I'll use tidyr::pivot_longer here (as well as dplyr for some massaging) to convert from wide to long, but it can also easily be done with data.table or (with a touch more work) base R.
A quick demo of wide-to-long:
library(tidyr)
library(dplyr)
pivot_longer(data1, -x, names_to = "y", values_to = "val") |>
  mutate(lbl = if_else(y == "y1", "Party A", "Party B"))
# # A tibble: 2,000 × 4
# x y val lbl
# <dbl> <chr> <dbl> <chr>
# 1 -5 y1 3.83e-136 Party A
# 2 -5 y2 1.49e- 6 Party B
# 3 -4.99 y1 1.34e-135 Party A
# 4 -4.99 y2 1.56e- 6 Party B
# 5 -4.98 y1 4.65e-135 Party A
# 6 -4.98 y2 1.64e- 6 Party B
# 7 -4.97 y1 1.62e-134 Party A
# 8 -4.97 y2 1.73e- 6 Party B
# 9 -4.96 y1 5.59e-134 Party A
# 10 -4.96 y2 1.81e- 6 Party B
# # ℹ 1,990 more rows
# # ℹ Use `print(n = ...)` to see more rows
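For comparison, here is a rough sketch of the same wide-to-long step in base R. It is only one of several ways to do it (reshape() would also work); the names stacked and long are mine, and the rows come out grouped by column (all of y1, then all of y2) rather than interleaved as in the pivot_longer output, which makes no difference for plotting.

```r
# Rebuild the question's data so this block runs on its own.
x <- seq(-5, 5, length.out = 1000)
data1 <- data.frame(x, y1 = dnorm(x, 0, 0.2), y2 = dnorm(x, 0, 1))

# stack() puts the selected columns on top of each other and returns a
# data frame with `values` (the numbers) and `ind` (the source column name).
stacked <- stack(data1[c("y1", "y2")])

# Reattach x (repeated once per stacked column) and add the party label.
long <- data.frame(
  x   = rep(data1$x, times = 2),
  y   = as.character(stacked$ind),
  val = stacked$values
)
long$lbl <- ifelse(long$y == "y1", "Party A", "Party B")

head(long)
```

The resulting long has the same four columns (x, y, val, lbl) as the tibble shown in the demo, so it can be passed to ggplot() in the same way.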
Having the data in long form allows simpler geom_* calls, typically down to a single one. I'll also assume that we can replace your geom_line(aes(x, y = 0)) with geom_hline to add the horizontal line.
library(ggplot2)
pivot_longer(data1, -x, names_to = "y", values_to = "val") |>
  mutate(lbl = if_else(y == "y1", "Party A", "Party B")) |>
  ggplot(aes(x, val)) +
  geom_line(aes(group = y, linetype = lbl)) +
  geom_hline(yintercept = 0) +
  geom_text(
    aes(label = lbl),
    data = ~ filter(., val > max(val)/2, .by = y) %>% slice_head(n = 1, by = lbl),
    hjust = 1.2
  ) +
  scale_linetype_manual(values = c("Party A"=2, "Party B"=1))
I also add the label programmatically by updating the data= inside the geom_text call, using the ~-style "function" expression.
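To see what that expression hands to geom_text, it may help to run the same filter outside the plot. This is just an illustrative sketch: label_fun and label_rows are names I made up, and the .by/by arguments assume dplyr 1.1.0 or later.

```r
library(dplyr)
library(tidyr)

# Rebuild the long data so this block runs on its own.
x <- seq(-5, 5, length.out = 1000)
data1 <- data.frame(x, y1 = dnorm(x, 0, 0.2), y2 = dnorm(x, 0, 1))
long <- pivot_longer(data1, -x, names_to = "y", values_to = "val") |>
  mutate(lbl = if_else(y == "y1", "Party A", "Party B"))

# Same logic as the ~ expression in the plot, written as a plain function:
# within each curve keep the points above half of that curve's maximum,
# then keep only the first (leftmost) such point per label.
label_fun <- function(d) {
  d |>
    filter(val > max(val) / 2, .by = y) |>
    slice_head(n = 1, by = lbl)
}

label_rows <- label_fun(long)
label_rows
```

This gives one row per party, sitting on the left flank of each curve at about half its height; hjust = 1.2 then pushes the text a little to the left of that point. Since a layer's data argument also accepts a function of the plot data, data = label_fun should be a drop-in replacement for the ~ expression in the geom_text call.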