How to include zeros when using count() + dplyr

November 23, 2023

The code I’ve got so far works fine, but I want to include in the output certain Majors with zero count. Having read around it looks like the solution is to include .drop = FALSE within count(), but I can’t get it to work.

#DATA
library(dplyr)
library(tidyr)

Sum22_Graduation_MajorsALL <- data.frame(
    Sum_2022_Graduation.Major_1 = c('CRJS', 'CRJS', 'ENGL', 'ENGL', 'JOE', 'DAN', 'HIST', 'PPE'),
    Sum_2022_Graduation.Major_2 = c('JOE', 'DAN', 'ENGL', 'HIST', 'PPE', 'CRJS', 'CRJS', 'PPE')
)

#CODE SO FAR
Sum22_selectCOHSSgrad <- Sum22_Graduation_MajorsALL %>%
    select(Sum_2022_Graduation.Major_1, Sum_2022_Graduation.Major_2) %>%
    pivot_longer(cols = everything(), names_to = NULL, values_to = 'Majors') %>%
    filter(Majors=='CRJS' | Majors=='ENGL' | Majors=='HIST' | Majors=='POLS' | Majors=='PPE') %>%
    count(Majors, name = "Count")

But because POLS does not occur in Sum22_Graduation_MajorsALL, the output doesn’t include POLS at all, whereas I would like it to include a POLS row with a count of 0. The documentation for dplyr seems to say that count(Majors, name = "Count", .drop = FALSE) should accomplish this. I’m obviously using it incorrectly, but can someone kindly point out where my error is?

Thanks and happy thanksgiving!

Solution:

The docs for dplyr::count() note that its .drop argument is passed on to dplyr::group_by(), which documents it as:

.drop Drop groups formed by factor levels that don’t appear in the data? The default is TRUE

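In your case, Majors is a character column, so it has no unused factor levels for .drop = FALSE to keep. Make Majors a factor, specifying the levels explicitly (including POLS), then set .drop = FALSE in your count() call.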

subjects_of_interest <- c("CRJS", "ENGL", "HIST", "POLS", "PPE")

Sum22_Graduation_MajorsALL |>
    select(Sum_2022_Graduation.Major_1, Sum_2022_Graduation.Major_2) |>
    pivot_longer(cols = everything(), names_to = NULL, values_to = "Majors") |>
    filter(Majors %in% subjects_of_interest) |>
    mutate(Majors = factor(Majors, levels = subjects_of_interest)) |>
    count(Majors, name = "Count", .drop = FALSE)

# # A tibble: 5 × 2
#   Majors Count
#   <fct>  <int>
# 1 CRJS       4
# 2 ENGL       3
# 3 HIST       2
# 4 POLS       0
# 5 PPE        3
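
To see why the factor conversion matters, here is a minimal sketch on made-up data (the names majors_chr and wanted are hypothetical, not from the question). It assumes R 4.1 or later for the native pipe and that dplyr is installed. On a character column .drop = FALSE has nothing to preserve, while on a factor the unused level is kept with a count of 0.

```r
library(dplyr)

# Toy data (hypothetical): "POLS" is a major of interest but never occurs.
majors_chr <- tibble(Majors = c("CRJS", "ENGL", "CRJS"))
wanted <- c("CRJS", "ENGL", "POLS")

# Character column: there are no levels, so .drop = FALSE has nothing to keep.
chr_counts <- majors_chr |>
    count(Majors, name = "Count", .drop = FALSE)

# Factor column: the unused level "POLS" is retained with a count of 0.
fct_counts <- majors_chr |>
    mutate(Majors = factor(Majors, levels = wanted)) |>
    count(Majors, name = "Count", .drop = FALSE)

nrow(chr_counts)  # 2 rows: CRJS and ENGL only
fct_counts$Count  # 2 1 0
```

The order of the rows in the factor version follows the order of the levels passed to factor(), which is also a convenient way to control how the majors are sorted in the output.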