Plot smooth line for multiple grouped gene expression data over a development course

Is there a way to plot multiple smooth lines, one per gene, connecting the group averages of expression values across 5 consecutive developmental stages?
I can plot a single smooth line if I filter for one gene, but I can't make it work for all genes at once.

# packages
library('tidyverse') # ggplot2 & dplyr
library('magrittr') # pipe operations (%>% is also available via tidyverse)

## generate data
set.seed(123)
num_genes <- 5
num_groups <- 5
exp <- data.frame()
for (gene_id in 1:num_genes) {
  gene_name <- paste("Gene", gene_id, sep = "_")
  for (group_id in 1:num_groups) {
    group_name <- paste("Stage", group_id, sep = "_")
    expression_values <- rnorm(10, mean = 10, sd = 2)  # Change 10 to your desired sample size
    group_data <- data.frame(Gene = gene_name, Group = group_name, Expression = expression_values)
    exp <- rbind(exp, group_data)
  }
}
exp$Group <- factor(exp$Group, c('Stage_1',  'Stage_2', 'Stage_3', 'Stage_4', 'Stage_5'))

## plots

# one nice line for one gene, but I can't combine multiple genes into one plot
exp %>%
  filter(Gene == 'Gene_1') %>%
  ggplot(aes(x=Group, y=Expression, group = 1)) +
  geom_point() +
  geom_smooth()

# only one smooth line instead of a line for each Gene
exp %>%
  ggplot(aes(x=Group, y=Expression, color = Gene, group = 1)) +
  geom_point() +
  geom_smooth()

# no line at all
exp %>%
  ggplot(aes(x=Group, y=Expression, color = Gene, group = Group)) +
  geom_point() +
  geom_smooth()

Solution:

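Because your x-axis variable is discrete, ggplot2 by default groups the data by the interaction of all discrete variables in the plot (here Gene and Group), so each group sits at a single x position and geom_smooth has nothing to fit a line through. You therefore have to set the group aesthetic explicitly. With group = 1 all points are treated as one series, and with group = Group each group again covers only one stage. Since you want one line per gene, use group = Gene: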

library(ggplot2)

ggplot(exp, aes(x = Group, y = Expression, color = Gene, group = Gene)) +
  geom_point() +
  geom_smooth()
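
If the goal is literally a line through the group averages rather than a fitted smoother, one possible alternative is to compute the mean per gene and stage and draw it with geom_line. The following is only a sketch: the name group_means and the column MeanExpression are made up for illustration, and the data is a stand-in built with expand.grid that has the same columns as exp in the question but not the same random values.

```r
library(ggplot2)
library(dplyr)

# Stand-in data with the same columns as `exp` in the question
# (assumption: values differ from the question's loop; only the structure matters).
set.seed(123)
stages <- paste("Stage", 1:5, sep = "_")
exp <- expand.grid(Rep = 1:10,
                   Group = stages,
                   Gene = paste("Gene", 1:5, sep = "_"),
                   stringsAsFactors = FALSE)
exp$Expression <- rnorm(nrow(exp), mean = 10, sd = 2)
exp$Group <- factor(exp$Group, levels = stages)

# One mean per Gene x Group combination (5 genes x 5 stages = 25 rows).
# `group_means` and `MeanExpression` are hypothetical names.
group_means <- exp %>%
  group_by(Gene, Group) %>%
  summarise(MeanExpression = mean(Expression), .groups = "drop")

# Raw points plus one line per gene connecting the stage means.
# The line layer inherits x, color and group from the main aes().
p <- ggplot(exp, aes(x = Group, y = Expression, color = Gene, group = Gene)) +
  geom_point(alpha = 0.4) +
  geom_line(data = group_means, aes(y = MeanExpression))
p
```

This draws straight segments between the stage means. If a smoothed curve is preferred, the geom_smooth call above with group = Gene remains the simpler option.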

