I want to create a new column based on the count of scores that are >= 3 according to the date.
For example, for dates that have multiple occurrences such as 2020-12-07, there is only 1 score that is >= 3, so the value for the new column num_greater_than_3 is 1 for both of the 2020-12-07 dates.
original:
ID <- c(1,1,1,1,1,1,1,1,1)
date <- c("2020-12-02","2020-12-03","2020-12-04","2020-12-05","2020-12-06","2020-12-07","2020-12-07,"2020-12-08","2020-12-08)
score <- c(2,5,2,2,3,3,1,1,3)
a <- data.frame(ID,date,score)
desired:
ID <- c(1,1,1,1,1,1,1,1,1)
date <- c("2020-12-02","2020-12-03","2020-12-04","2020-12-05","2020-12-06","2020-12-07","2020-12-07,"2020-12-08","2020-12-08)
score <- c(2,5,2,2,3,3,1,1,3)
num_score_greater_than_3 <- (0,1,0,0,1,1,1,1,1)
a <- data.frame(ID,date,score,num_score_greater_than_3)
I appreciate all the help there is! Thanks!
>Solution :
I assume you want greater than or equal to 3:
library(dplyr)
a %>%
group_by(date) %>%
mutate(num_score_greater_than_3=sum(score>=3))
ID date score num_score_greater_than_3
<dbl> <fct> <dbl> <int>
1 1 2020-12-02 2 0
2 1 2020-12-03 5 1
3 1 2020-12-04 2 0
4 1 2020-12-05 2 0
5 1 2020-12-06 3 1
6 1 2020-12-07 3 1
7 1 2020-12-07 1 1
8 1 2020-12-08 1 1
9 1 2020-12-08 3 1