How can I create a stacked barchart with this data using ggplot2?

This is the data I’m working with:

Station Salinity CentricD PennateD Dinoflag MarineFlag Cilliates
A3 18.3 181000 26500 1000 15500 2250
A6 27.4 584666.6667 4666.666667 11666.66667 0 61333.33333
A8 25.7 625071.4286 2000 74000 294.1176471 1907.563025
B 29.77785714 503693.8776 2000 6642.857143 7642.857143 5622.44898
C 31.283 266991.5966 5285.714286 10714.28571 71352.94118 12067.22689
D 32.21625 349375 6437.5 6142.857143 39651.78571 4339.285714
E 32.23 379200 466.6666667 3714.285714 12228.57143 4504.761905
F 32.8 559000 0 333.3333333 0 11000
G 33.185 209276.7857 2125 5714.285714 27937.5 3062.5
H 33.67 98714.28571 1812.5 7125 6410.714286 7750
I 34.33294118 113302.521 1764.705882 40142.85714 5588.235294 9260.504202
J 34.537 68142.85714 1000 12842.85714 20228.57143 5271.428571

I want to make a stacked bar chart with ‘Station’ on the x-axis and each type of phytoplankton stacked per station, to show both how many phytoplankton there are at each station and what that total is made up of.
I just don’t know how to do that. Looking at geom_bar(), I need to specify a ‘fill’ variable, but I don’t have just one: I have 5 types of phytoplankton that I want to fill with.

I’m sure that this is just a data formatting issue, but I can’t find any examples of how to properly format it. Thanks in advance.

Solution:

You would first have to pivot the data to be in long format, then you could make the graph using the pivoted values as the y-axis values and the pivoted variable names as the fill variable. Here’s an example.
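
To make the long format concrete, here is a minimal sketch using just two stations copied from the question's table. The names `small` and `small_long` are made up for this illustration; `phyto` and `val` are simply the column names chosen for the pivoted variable names and values.

```r
library(tidyr)

# Two stations from the question's data, kept small so the reshape is easy to read
small <- tibble::tribble(
  ~Station, ~Salinity, ~CentricD, ~PennateD, ~Dinoflag,   ~MarineFlag, ~Cilliates,
  "A3",     18.3,      181000,    26500,     1000,        15500,       2250,
  "F",      32.8,      559000,    0,         333.3333333, 0,           11000
)

# Wide -> long: the five phytoplankton columns become one name column and one value column
small_long <- pivot_longer(small, CentricD:Cilliates,
                           names_to = "phyto", values_to = "val")

print(small_long)
```

Each station now occupies five rows, one per phytoplankton type, so a single column (`phyto`) can be mapped to `fill`. Salinity is not pivoted and is simply repeated on every row of its station.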

Original Data

library(dplyr)
library(tidyr)
library(ggplot2)
dat <- tibble::tribble(
  ~Station , ~Salinity    , ~CentricD    , ~PennateD    , ~Dinoflag    , ~MarineFlag  , ~Cilliates   ,
"A3"      , 18.3        , 181000      , 26500       , 1000        , 15500       , 2250        ,
"A6"      , 27.4        , 584666.6667 , 4666.666667 , 11666.66667 , 0           , 61333.33333 ,
"A8"      , 25.7        , 625071.4286 , 2000        , 74000       , 294.1176471 , 1907.563025 ,
"B"       , 29.77785714 , 503693.8776 , 2000        , 6642.857143 , 7642.857143 , 5622.44898  ,
"C"       , 31.283      , 266991.5966 , 5285.714286 , 10714.28571 , 71352.94118 , 12067.22689 ,
"D"       , 32.21625    , 349375      , 6437.5      , 6142.857143 , 39651.78571 , 4339.285714 ,
"E"       , 32.23       , 379200      , 466.6666667 , 3714.285714 , 12228.57143 , 4504.761905 ,
"F"       , 32.8        , 559000      , 0           , 333.3333333 , 0           , 11000       ,
"G"       , 33.185      , 209276.7857 , 2125        , 5714.285714 , 27937.5     , 3062.5      ,
"H"       , 33.67       , 98714.28571 , 1812.5      , 7125        , 6410.714286 , 7750        ,
"I"       , 34.33294118 , 113302.521  , 1764.705882 , 40142.85714 , 5588.235294 , 9260.504202 ,
"J"       , 34.537      , 68142.85714 , 1000        , 12842.85714 , 20228.57143 , 5271.428571 )

Here, I use pivot_longer() from tidyr to reshape the data, then plot the raw values by station and phytoplankton type. Note that if you are providing the y value directly (rather than letting ggplot2 count the rows), you need to use stat="identity" in geom_bar().

dat %>% 
  pivot_longer(CentricD:Cilliates, names_to = "phyto", values_to = "val") %>% 
  ggplot(aes(x=Station, y=val, fill = phyto)) + 
  geom_bar(stat="identity") + 
  theme_bw()
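
As a side note, geom_col() is ggplot2's shorthand for geom_bar(stat="identity"), so the same plot can be written a little more briefly. The sketch below uses a two-station subset of the question's data (the names `small`, `small_long`, `p`, and `totals` are hypothetical) and also computes the per-station totals, which are the heights the stacked bars should reach.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Two stations from the question's data, for illustration only
small <- tibble::tribble(
  ~Station, ~Salinity, ~CentricD, ~PennateD, ~Dinoflag,   ~MarineFlag, ~Cilliates,
  "A3",     18.3,      181000,    26500,     1000,        15500,       2250,
  "F",      32.8,      559000,    0,         333.3333333, 0,           11000
)

small_long <- small %>%
  pivot_longer(CentricD:Cilliates, names_to = "phyto", values_to = "val")

# geom_col() uses the y values as given, like geom_bar(stat = "identity")
p <- ggplot(small_long, aes(x = Station, y = val, fill = phyto)) +
  geom_col() +
  theme_bw()

# Total per station: the expected height of each stacked bar
totals <- small_long %>%
  group_by(Station) %>%
  summarise(total = sum(val))

print(totals)
```

Comparing `totals` against the top of each bar is a quick way to confirm that nothing was dropped or double counted during the reshape.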

If you would rather show each type as a share of the station total, so that every bar has the same height, you can compute the proportion within each Station first and then plot that variable instead. Note that pct here runs from 0 to 1 rather than 0 to 100.


dat %>% 
  pivot_longer(CentricD:Cilliates, names_to = "phyto", values_to = "val") %>% 
  group_by(Station) %>% 
  mutate(pct = val/sum(val)) %>% 
  ggplot(aes(x=Station, y=pct, fill = phyto)) + 
  geom_bar(stat="identity") + 
  theme_bw()
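
An alternative worth knowing about, offered here as a sketch rather than as part of the original answer: ggplot2 can do the rescaling itself with position = "fill", which stacks the bars and normalizes each one to a height of 1. The example again uses a two-station subset of the question's data with hypothetical object names, and labels the axis with scales::label_percent(), which assumes a reasonably recent version of the scales package.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Two stations from the question's data, for illustration only
small <- tibble::tribble(
  ~Station, ~Salinity, ~CentricD, ~PennateD, ~Dinoflag,   ~MarineFlag, ~Cilliates,
  "A3",     18.3,      181000,    26500,     1000,        15500,       2250,
  "F",      32.8,      559000,    0,         333.3333333, 0,           11000
)

small_long <- small %>%
  pivot_longer(CentricD:Cilliates, names_to = "phyto", values_to = "val")

# position = "fill" rescales each station's stack to sum to 1
p_fill <- ggplot(small_long, aes(x = Station, y = val, fill = phyto)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(y = "Share of station total") +
  theme_bw()

# The manual equivalent: proportions computed per station sum to 1
shares <- small_long %>%
  group_by(Station) %>%
  mutate(pct = val / sum(val)) %>%
  summarise(total_share = sum(pct))

print(shares)
```

Computing pct by hand, as in the answer, is still the better choice when the proportions are needed for anything beyond the plot, such as text labels or a summary table.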

Created on 2024-12-04 with reprex v2.1.0
