I created a boxplot for a data set without using the geom_jitter function, yet there are still dots in the plot. Are those dots statistical values, or why do they appear?
I attached the code I use below.
pacman::p_load(tidyverse, readxl, janitor, emmeans, multcomp, magrittr,
parameters, effectsize, multcompView, see, performance,
conflicted, ggpubr, rstatix)
conflict_prefer("select", "dplyr")
conflict_prefer("filter", "dplyr")
conflict_prefer("summarise", "dplyr")
conflict_prefer("extract", "magrittr")
cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7") ## Color blind friendly palette
##--------------------------------------------------------------------------------------------------------------------------
## Function to open an Excel file with multiple sheets and select one of them
library(readxl)
read_excel_allsheets <- function(filename) {
  sheets <- readxl::excel_sheets(filename)
  x <- lapply(sheets, function(x) readxl::read_excel(filename, sheet = x))
  return(x)
}
big_tbl <- read_excel_allsheets("Mesocosms_R.xlsx")
big_tbl
phyto_plankton_tbl<- big_tbl[[14]]
##--------------------------------------------------------------------------------------------------------------------------
## Data transformation
## The result of mutate() has to be assigned, otherwise the new factors are discarded
phyto_plankton_tbl <- phyto_plankton_tbl %>%
  mutate(
    block = as.factor(block),
    trt = factor(trt, labels = c("-P&-F", "+P/-F", "+P/+F", "-P/+F"))) %>%
  gather(key = "time", value = "PelaChl", t0, t1, t2, t3, t4, t5) %>% ## Converts the table from wide format to long format
  convert_as_factor(trt, time)
print(phyto_plankton_tbl, n = 40)
##--------------------------------------------------------------------------------------------------------------------------
## Visualization
pelaChl_bxp <- ggplot(data = phyto_plankton_tbl, aes(x = time, y = PelaChl, fill = trt)) +
  geom_boxplot() +
  ylim(0, 50) +
  scale_fill_manual(values = cbPalette) + ## Adds color blind friendly palette
  ## geom_jitter() +
  theme_bw()
>Solution :
From the ggplot2 documentation for geom_boxplot():
The boxplot compactly displays the distribution of a continuous variable. It visualises five summary statistics (the median, two hinges and two whiskers), and all "outlying" points individually.
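The individual points you are seeing are outliers. By default, geom_boxplot() draws every observation that lies more than 1.5 times the interquartile range beyond a hinge as a separate point, so the dots are part of your data and not additional statistics.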
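If you want to check which values are flagged, or hide the points, something along these lines should work. This is only a minimal sketch: the data frame `toy_tbl` and the helper `is_outlier()` are made up for illustration and are not part of your script, and the helper simply re-implements the 1.5 * IQR rule with base R quantiles.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): flags values lying more than
# 1.5 * IQR below the lower hinge or above the upper hinge.
is_outlier <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  iqr <- q[2] - q[1]
  x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
}

# Made-up example data with one extreme value
toy_tbl <- data.frame(
  time  = factor(rep("t0", 6)),
  value = c(1, 2, 3, 4, 5, 100)
)

# Which observations would be drawn as individual points?
flagged <- toy_tbl$value[is_outlier(toy_tbl$value)]  # 100 is flagged

# Default behaviour: the outlier is drawn as a dot
p_default <- ggplot(toy_tbl, aes(x = time, y = value)) +
  geom_boxplot()

# Hide the outlier points, e.g. when adding geom_jitter() later,
# so that no observation is drawn twice
p_hidden <- ggplot(toy_tbl, aes(x = time, y = value)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1, height = 0)
```

Note that outlier.shape = NA only suppresses the drawing of the points; the values remain in the data and still influence the axis range.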