how to create a count variable that counts based on the value of a binary variable in R using dplyr

code to create df: I have the repeat data that looks like the table below.

df <- structure(list(patid = c("1", "1", "1", "1", "2", "2", "3", "3", 
                         "3", "4", "4", "4", "4"), observation_date = c("07/07/2016", 
                                                                        "07/08/2016", "07/11/2016", "07/07/2019", "07/05/2015", "02/12/2016", 
                                                                        "07/05/2015", "07/06/2015", "16/06/2015", "07/05/2015", "02/12/2016", 
                                                                        "18/12/2016", "15/01/2017"),
                      registration = c("0","0","1","1","0","1","0","0","0","0","1","1","1")), class = "data.frame", row.names = c(NA, 
                                                                                                                                          -13L))
patid observation_date registration
1 07/07/2016 0
1 07/08/2016 0
1 07/11/2016 1
1 07/07/2019 1
2 07/05/2015 0
2 02/12/2016 1
3 07/05/2015 0
3 07/06/2015 0
3 16/06/2015 0
4 07/05/2015 0
4 02/12/2016 1
4 18/12/2016 1
4 15/01/2017 1

I would like to summarize data in this way: I want two rows only for each patid, one row will have a count of the observation_date when value of registration=0 and one row will have a count of the observation_date when value of registration=1. This is the code I tried but it is only giving a count of all observation_date per patid

cons_sum <- df%>%
         group_by(patid) %>%
         arrange(observation_date)%>%
         tally()

The final table should look like this:

patid registration count
1 0 2
1 1 2
2 0 1
2 1 1
3 0 4
3 1 0
4 0 1
4 1 3

>Solution :

Use count. To make every possible value appear in the final table, you should convert your registration column to a factor:

df %>% 
  count(patid, registration = factor(registration), .drop = FALSE)

output

  patid registration n
1     1            0 2
2     1            1 2
3     2            0 1
4     2            1 1
5     3            0 3
6     3            1 0
7     4            0 1
8     4            1 3

Leave a Reply