I’m currently analyzing a dataset and need help with basic data preparation and with showing the data in a rather complex bar plot. The diagram should look similar to the attached image, but of course with different variables.
Let’s just use a mock data set for illustration:
df <- data.frame(id=c(1,2,3,4,5,6,7,8,9,10),
district=c("1","1","2","3","2","1","1","2","3","2"),
f1=c("1","2","3","1","2","3","1","2","2","3")
)
district = The city has 3 different districts
f1 = First question of the survey with 3 different categories
I want to show the percentage for each category per district and plot it similar to the plot in the image. First I want to display the overall percentage (for the whole city), and then the percentage per district, all in the same plot.
I’m grateful for any help. Thanks a lot.
I want a plot similar to this one
>Solution :
As a first step, I would recommend computing the counts and percentages per district. After that, it’s pretty straightforward to create a stacked bar chart using ggplot2:
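library(dplyr)  # the .by argument of mutate() needs dplyr >= 1.1.0
library(ggplot2)
# Count answers per district, then convert counts to within-district shares
df |>
  count(district, f1) |>
  mutate(pct = prop.table(n), .by = district) |>
  ggplot(aes(pct, district, fill = f1)) +
  geom_col()  # stacked horizontal bars, one per district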
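To sanity-check the percentages before plotting, the same within-district shares can be computed with base R. This is only a minimal sketch using the mock data from the question; the name `shares` is made up for illustration.

```r
# Mock data from the question
df <- data.frame(
  id = 1:10,
  district = c("1", "1", "2", "3", "2", "1", "1", "2", "3", "2"),
  f1 = c("1", "2", "3", "1", "2", "3", "1", "2", "2", "3")
)

# Cross-tabulate district by answer, then divide each row by its row total
# (margin = 1), which gives the share of each answer within a district
shares <- prop.table(table(district = df$district, f1 = df$f1), margin = 1)
shares
```

Each row of `shares` should sum to 1, and the values should match the `pct` column computed with dplyr.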

EDIT: One option to add the overall results is to first "clone" your data and set the district to "city" or …, and then bind the copy to your original dataset, e.g. with dplyr::bind_rows:
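# Copy of the data labelled "city", stacked on top of the original rows
# (note that this overwrites df)
df <- df |>
  mutate(district = "city") |>
  bind_rows(df)
# Same pipeline as before; "city" is now just one more group
df |>
  count(district, f1) |>
  mutate(pct = prop.table(n), .by = district) |>
  ggplot(aes(pct, district, fill = f1)) +
  geom_col()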
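Since the question asks for the overall result first and the districts after it, one possible refinement is to fix the order of the bars with a factor and to label the segments with percentages. The following is a sketch under a few assumptions: it starts again from the original mock data, the level order is hard-coded for the three districts, the names `plot_data` and `p` are only illustrative, and the scales package (installed alongside ggplot2) is available.

```r
library(dplyr)    # .by in mutate() needs dplyr >= 1.1.0
library(ggplot2)

# Mock data from the question
df <- data.frame(
  id = 1:10,
  district = c("1", "1", "2", "3", "2", "1", "1", "2", "3", "2"),
  f1 = c("1", "2", "3", "1", "2", "3", "1", "2", "2", "3")
)

plot_data <- df |>
  mutate(district = "city") |>       # copy of the data for the overall bar
  bind_rows(df) |>
  count(district, f1) |>
  mutate(pct = n / sum(n), .by = district) |>
  # On a discrete y axis the first level is drawn at the bottom,
  # so "city" goes last to appear at the top of the plot
  mutate(district = factor(district, levels = c("3", "2", "1", "city")))

p <- ggplot(plot_data, aes(pct, district, fill = f1)) +
  geom_col() +
  # Centre a percentage label inside each stacked segment
  geom_text(aes(label = scales::percent(pct, accuracy = 1)),
            position = position_stack(vjust = 0.5)) +
  scale_x_continuous(labels = scales::label_percent()) +
  labs(x = NULL, y = NULL)
p
```

With real data, the levels could instead be built from the district values so that nothing has to be typed by hand.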
