I’d like to create a rule based on counts and dates. This is what I have tried:
# Package
library(dplyr)
# Open data set
pred_avg<- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/cc_mean_CI.csv")
str(pred_avg)
# Classify each date: "bad" if canopycoverSDmin < covermin, otherwise "good"
pred_avg$class<-ifelse(pred_avg$canopycoverSDmin<pred_avg$covermin,"bad","good")
pred_avg <-pred_avg[,-c(1,3:8)]
#'data.frame': 17 obs. of 2 variables:
#$ DATE : chr "2021-06-04" "2021-06-14" "2021-06-24" "2021-07-04" ...
#$ class: chr "good" "good" "bad" "bad" ...
# Now I'd like to make a decision: if I have 3 "bad"s then attack, otherwise monitoring
pred_avg_final <- pred_avg %>% group_by(DATE) %>% mutate(class = factor(class)) %>%
count(class, name = "occurencies", .drop = F) %>%
summarize(decision=ifelse(occurencies>=3,"attack","monitoring"))
pred_avg_final
# A tibble: 34 x 2
# Groups: DATE [17]
# DATE decision
# <chr> <chr>
# 1 2021-06-04 monitoring
# 2 2021-06-04 monitoring
# 3 2021-06-14 monitoring
# 4 2021-06-14 monitoring
# 5 2021-06-24 monitoring
# 6 2021-06-24 monitoring
# 7 2021-07-04 monitoring
# 8 2021-07-04 monitoring
# 9 2021-07-09 monitoring
# 10 2021-07-09 monitoring
But I have a problem that I haven’t managed to solve. I’d like to apply the condition ifelse(occurencies>=3,"attack","monitoring")
only to adjacent (consecutive) dates, not to non-consecutive ones. For example, I have "bad" on 2021-06-24, 2021-07-04 and 2021-07-09 (consecutive dates), so the decision on
2021-07-09 is attack. For all other dates, through to the end, the decision is monitoring, because there is no other run of 3 "bad"s on consecutive dates.
My desired output is:
# DATE class decision
# 1 2021-06-04 good monitoring
# 2 2021-06-14 good monitoring
# 3 2021-06-24 bad monitoring
# 4 2021-07-04 bad monitoring
# 5 2021-07-09 bad attack
# 6 2021-07-19 good monitoring
# 7 2021-07-24 good monitoring
# 8 2021-08-03 good monitoring
# 9 2021-08-08 good monitoring
# 10 2021-08-13 good monitoring
# 11 2021-08-23 bad monitoring
# 12 2021-09-02 good monitoring
# 13 2021-09-07 good monitoring
# 14 2021-09-22 bad monitoring
# 15 2021-10-22 bad monitoring
# 16 2021-12-06 good monitoring
# 17 2021-12-26 good monitoring
Could anyone help with this?
>Solution :
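You can look at previous rows with the function dplyr::lag(): a row gets "attack" only when it and the two rows before it are all "bad".
This assumes pred_avg is sorted by DATE and still has the class column (that is, before the count()/summarize() step). Passing default = "good" to lag() keeps the first two rows from being compared against NA.
Is this what you’re looking for?
pred_avg %>%
  mutate(decision = ifelse(class == "bad" & lag(class, 1, default = "good") == "bad" &
                             lag(class, 2, default = "good") == "bad",
                           "attack", "monitoring"))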
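To check the idea against the desired output in the question without downloading the CSV, here is a minimal, self-contained sketch. The DATE and class values are copied from the desired-output table above. The name `result`, the `arrange(DATE)` step and the `default = "good"` argument are additions for illustration, not part of the original answer.

```r
library(dplyr)

# DATE and class values copied from the desired output in the question
pred_avg <- data.frame(
  DATE = c("2021-06-04", "2021-06-14", "2021-06-24", "2021-07-04",
           "2021-07-09", "2021-07-19", "2021-07-24", "2021-08-03",
           "2021-08-08", "2021-08-13", "2021-08-23", "2021-09-02",
           "2021-09-07", "2021-09-22", "2021-10-22", "2021-12-06",
           "2021-12-26"),
  class = c("good", "good", "bad", "bad", "bad", "good", "good", "good",
            "good", "good", "bad", "good", "good", "bad", "bad", "good",
            "good")
)

result <- pred_avg %>%
  # ISO-formatted date strings sort chronologically, so lag() sees true neighbours
  arrange(DATE) %>%
  mutate(
    decision = if_else(
      class == "bad" &
        # default = "good" (an assumption) avoids NA comparisons in the first two rows
        lag(class, 1, default = "good") == "bad" &
        lag(class, 2, default = "good") == "bad",
      "attack", "monitoring"
    )
  )

print(result)
```

Note that "neighbouring" here means adjacent rows after sorting, not a fixed number of calendar days between observations. If a maximum gap between dates matters, that would need an extra condition on the date differences.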