How to include zeros when using count() with dplyr

The code I’ve got so far works fine, but I want the output to include certain Majors that have a count of zero. Having read around, it looks like the solution is to include .drop = FALSE within count(), but I can’t get it to work.

#DATA
library(dplyr)  # select(), filter(), count(), %>%
library(tidyr)  # pivot_longer()

Sum22_Graduation_MajorsALL <- data.frame(
    Sum_2022_Graduation.Major_1 = c('CRJS', 'CRJS', 'ENGL', 'ENGL', 'JOE', 'DAN', 'HIST', 'PPE'),
    Sum_2022_Graduation.Major_2 = c('JOE', 'DAN', 'ENGL', 'HIST', 'PPE', 'CRJS', 'CRJS', 'PPE')
)

#CODE SO FAR
Sum22_selectCOHSSgrad <- Sum22_Graduation_MajorsALL %>%
    select(Sum_2022_Graduation.Major_1, Sum_2022_Graduation.Major_2) %>%
    pivot_longer(cols = everything(), names_to = NULL, values_to = 'Majors') %>%
    filter(Majors=='CRJS' | Majors=='ENGL' | Majors=='HIST' | Majors=='POLS' | Majors=='PPE') %>%
    count(Majors, name = "Count")

But because POLS does not occur in Sum22_Graduation_MajorsALL, the output doesn’t include POLS at all, whereas I would like it to include a row for 'POLS' with a count of 0. The documentation for dplyr seems to say that count(Majors, name = "Count", .drop = FALSE) should accomplish this. I’m obviously using it incorrectly, but can someone kindly point out where my error is?

Thanks and happy thanksgiving!

Solution:

The docs for dplyr::count() note that its .drop argument is passed on to dplyr::group_by(), where it is described as:

.drop Drop groups formed by factor levels that don’t appear in the data? The default is TRUE

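The catch is that .drop only applies to factor levels. Your Majors column is a character vector, which has no levels, so count() has no way of knowing that POLS is a possible value. Make Majors a factor, specifying the levels explicitly, and then set .drop = FALSE in your count() call.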

subjects_of_interest <- c("CRJS", "ENGL", "HIST", "POLS", "PPE")

Sum22_Graduation_MajorsALL |>
    select(Sum_2022_Graduation.Major_1, Sum_2022_Graduation.Major_2) |>
    pivot_longer(cols = everything(), names_to = NULL, values_to = "Majors") |>
    filter(Majors %in% subjects_of_interest) |>
    mutate(Majors = factor(Majors, levels = subjects_of_interest)) |>
    count(Majors, name = "Count", .drop = FALSE)

# # A tibble: 5 × 2
#   Majors Count
#   <fct>  <int>
# 1 CRJS       4
# 2 ENGL       3
# 3 HIST       2
# 4 POLS       0
# 5 PPE        3
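
To see why the factor conversion matters, here is a minimal sketch comparing the same count() call on a character column and on a factor column. The toy data and the names majors, wanted, counts_chr and counts_fct are made up for illustration and are not part of the original question.

```r
library(dplyr)

# Hypothetical toy data: "POLS" never occurs in it
majors <- data.frame(Majors = c("CRJS", "ENGL", "CRJS"))
wanted <- c("CRJS", "ENGL", "POLS")

# Character column: there are no levels, so .drop = FALSE has nothing to keep
counts_chr <- majors |>
    count(Majors, name = "Count", .drop = FALSE)

# Factor column with explicit levels: the unused level is kept with a count of 0
counts_fct <- majors |>
    mutate(Majors = factor(Majors, levels = wanted)) |>
    count(Majors, name = "Count", .drop = FALSE)

nrow(counts_chr)  # 2 rows: CRJS, ENGL
nrow(counts_fct)  # 3 rows: CRJS, ENGL, POLS
```

The only difference between the two pipelines is the mutate() step, which is what lets .drop = FALSE produce the zero row.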