I would like to summarise my data by a grouping variable and get, for each of the other variables, the proportion of "yes" values. For this example data:
df <- data.frame(group = c("a","a","a","a","b","b","b","b"),
                 female = c("yes","no","yes","no","yes","yes","yes","no"),
                 alcohol = c("yes","no",NA,"no","yes","yes","no","no"))
  group female alcohol
1     a    yes     yes
2     a     no      no
3     a    yes    <NA>
4     a     no      no
5     b    yes     yes
6     b    yes     yes
7     b    yes      no
8     b     no      no
The expected result would be this (either a ratio or a percentage is fine):
> df_result
  group female alcohol
1     a   0.50    0.33
2     b   0.75    0.50
I tried with the tidyverse and the summarise function, but I have a lot of columns, and naming each of them would take a really long time.
>Solution :
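library(dplyr)  # .by requires dplyr >= 1.1.0

# mean() of a logical vector is the proportion of TRUE; na.rm = TRUE drops NAs
df %>%
  summarise(across(everything(), ~ mean(.x == "yes", na.rm = TRUE)), .by = group)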
  group female   alcohol
1     a   0.50 0.3333333
2     b   0.75 0.5000000
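If you are on an older dplyr that lacks the .by argument, or you would rather have percentages than ratios, something along these lines should work. This is only a sketch: it rebuilds the example df from the question, the name df_pct is made up for illustration, and rounding to one decimal place is an arbitrary choice.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  group   = c("a", "a", "a", "a", "b", "b", "b", "b"),
  female  = c("yes", "no", "yes", "no", "yes", "yes", "yes", "no"),
  alcohol = c("yes", "no", NA, "no", "yes", "yes", "no", "no")
)

# group_by() plays the role of .by here; across() skips the grouping column
# df_pct is a hypothetical name used only in this sketch
df_pct <- df %>%
  group_by(group) %>%
  summarise(
    # share of "yes" among non-missing values, as a percentage
    across(everything(), ~ round(100 * mean(.x == "yes", na.rm = TRUE), 1)),
    .groups = "drop"
  )

df_pct
```

Note that with na.rm = TRUE the missing values are dropped from the denominator as well, which is why alcohol in group a is 1 out of 3 rather than 1 out of 4.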