I have this data frame:
A <- c(100,101,102,98,97,93,96)
B <- c("John","Anne","John", "Anne","John","Anne","John")
C <- c("cheap", "cheap", "expensive", "cheap", "expensive", "cheap", "expensive")
D <- c("USA", "Mexico", "Mexico","USA", "Mexico","USA", "Mexico")
dataframe <- data.frame(A, B, C, D)
    A    B         C      D
1 100 John     cheap    USA
2 101 Anne     cheap Mexico
3 102 John expensive Mexico
4  98 Anne     cheap    USA
5  97 John expensive Mexico
6  93 Anne     cheap    USA
7  96 John expensive Mexico
I want to draw several boxplots on the same plot, one for each group found in columns B, C and D.
That makes a total of 6 boxplots (John, Anne, cheap, expensive, USA and Mexico).
Each boxplot should show the distribution of the A values for its group.
The problem is that each subgroup has a different number of samples to plot, which confuses me.
>Solution :
This problem is a matter of reshaping the data to long format. Then it becomes a standard boxplot.
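library(ggplot2)
library(magrittr)  # provides the %>% pipe
library(tidyr)

dataframe %>%
  # stack B, C and D into "name"/"value" columns, keeping A alongside
  pivot_longer(-A) %>%
  # one box per distinct value (John, Anne, cheap, ...), with A on the y axis
  ggplot(aes(x = value, y = A)) +
  geom_boxplot()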
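
To see why the differing sample sizes are not an obstacle, it may help to look at the reshaped data on its own. The sketch below rebuilds the data frame from the question and counts how many A values end up behind each box; the names `long` and `counts` are only illustrative and do not appear in the original answer.

```r
library(tidyr)

# Same data as in the question
dataframe <- data.frame(
  A = c(100, 101, 102, 98, 97, 93, 96),
  B = c("John", "Anne", "John", "Anne", "John", "Anne", "John"),
  C = c("cheap", "cheap", "expensive", "cheap", "expensive", "cheap", "expensive"),
  D = c("USA", "Mexico", "Mexico", "USA", "Mexico", "USA", "Mexico")
)

# Long format: one row per (original row, grouping column) pair,
# so 7 rows x 3 grouping columns = 21 rows
long <- pivot_longer(dataframe, -A)

# Number of A values that feed each boxplot
counts <- table(long$value)
print(counts)
```

Each original row contributes once per grouping column, so John and Anne together account for all 7 rows, as do cheap and expensive, and USA and Mexico. `geom_boxplot()` computes each box from whatever rows fall in its group, so unequal group sizes need no special handling.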
