Summarise columns into percentage of a certain value

December 13, 2023

I would like to summarise my data by a grouping variable and get, for each of the other variables, the percentage of "yes" values. For this example data:

df <- data.frame(group = c("a","a","a","a","b","b","b","b"),
                 female = c("yes","no","yes","no","yes","yes","yes","no"),
                 alcohol = c("yes","no",NA,"no","yes","yes","no","no"))

  group female alcohol
1     a    yes     yes
2     a     no      no
3     a    yes    <NA>
4     a     no      no
5     b    yes     yes
6     b    yes     yes
7     b    yes      no
8     b     no      no

The desired result would be this (either a ratio or a percentage is fine):

> df_result
  group female alcohol
1     a   0.50    0.33
2     b   0.75    0.50

I tried with the tidyverse and the summarise function, but I have a lot of columns and naming each of them would take far too long.

>Solution :

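library(dplyr)

# For every non-grouping column, compute the share of values equal to "yes"
# within each group; na.rm = TRUE drops NA values from the denominator.
df %>% 
  summarise(across(everything(), ~mean(.=="yes", na.rm =  TRUE)), .by=group)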
  group female   alcohol
1     a   0.50 0.3333333
2     b   0.75 0.5000000
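
As far as I know, the `.by` argument needs dplyr 1.1.0 or later. On an older version, the same idea can be sketched with `group_by()` instead. The variant below also scales the ratios to percentages rounded to one decimal, since the question allows either form. The name `df_pct` and the rounding are my own choices for illustration, not part of the original answer.

```r
library(dplyr)

# Example data from the question
df <- data.frame(group = c("a","a","a","a","b","b","b","b"),
                 female = c("yes","no","yes","no","yes","yes","yes","no"),
                 alcohol = c("yes","no",NA,"no","yes","yes","no","no"))

# group_by() variant: across(everything()) skips the grouping column,
# so only female and alcohol are summarised.
df_pct <- df %>%
  group_by(group) %>%
  summarise(across(everything(),
                   ~ round(100 * mean(.x == "yes", na.rm = TRUE), 1)),
            .groups = "drop")

print(df_pct)
```

If only some of the columns hold yes/no answers, `everything()` can be swapped for a narrower selection such as `c(female, alcohol)`.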