Create a rule condition based on count and dates

I’d like to create a rule condition based on counts and dates. This is what I tried:

    # Package
    library(dplyr)

    # Open data set
    pred_avg<- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/cc_mean_CI.csv")
    str(pred_avg)
   
    # Classify each date: "bad" if canopycoverSDmin < covermin, otherwise "good"
    pred_avg$class <- ifelse(pred_avg$canopycoverSDmin < pred_avg$covermin, "bad", "good")
    pred_avg <-pred_avg[,-c(1,3:8)]
    #'data.frame':  17 obs. of  2 variables:
    #$ DATE : chr  "2021-06-04" "2021-06-14" "2021-06-24" "2021-07-04" ...
    #$ class: chr  "good" "good" "bad" "bad" ...

    # Now I'd like to make a decision: "attack" if I have 3 "bad"s, otherwise "monitoring"
    pred_avg_final <- pred_avg %>% group_by(DATE) %>% mutate(class = factor(class)) %>%
            count(class, name = "occurencies", .drop = FALSE) %>%
            summarize(decision=ifelse(occurencies>=3,"attack","monitoring"))
    pred_avg_final
    #      A tibble: 34 x 2
    #      Groups:   DATE [17]
    #       DATE       decision  
    #       <chr>      <chr>     
    #     1 2021-06-04 monitoring
    #     2 2021-06-04 monitoring
    #     3 2021-06-14 monitoring
    #     4 2021-06-14 monitoring
    #     5 2021-06-24 monitoring
    #     6 2021-06-24 monitoring
    #     7 2021-07-04 monitoring
    #     8 2021-07-04 monitoring
    #     9 2021-07-09 monitoring
    #    10 2021-07-09 monitoring

But I have a problem that I haven’t been able to solve. I’d like to apply the condition ifelse(occurencies>=3,"attack","monitoring") only to neighbouring (consecutive) dates, not to non-consecutive ones. For example, I have "bad" on 2021-06-24, 2021-07-04 and 2021-07-09 (consecutive dates), so the decision on 2021-07-09 is attack. For all other dates, through to the end, the decision is monitoring, because there are never 3 "bad"s on neighbouring dates again.

My desired output is:

    #          DATE class decision
    # 1  2021-06-04  good monitoring
    # 2  2021-06-14  good monitoring
    # 3  2021-06-24   bad monitoring
    # 4  2021-07-04   bad monitoring
    # 5  2021-07-09   bad attack
    # 6  2021-07-19  good monitoring
    # 7  2021-07-24  good monitoring
    # 8  2021-08-03  good monitoring
    # 9  2021-08-08  good monitoring
    # 10 2021-08-13  good monitoring
    # 11 2021-08-23   bad monitoring
    # 12 2021-09-02  good monitoring
    # 13 2021-09-07  good monitoring
    # 14 2021-09-22   bad monitoring
    # 15 2021-10-22   bad monitoring
    # 16 2021-12-06  good monitoring
    # 17 2021-12-26  good monitoring

Can anyone help with this?

>Solution :

You can take a look at previous values with the function dplyr::lag().
Is this what you’re looking for?

    pred_avg %>%
      mutate(decision = ifelse(class == "bad" & lag(class, 1) == "bad" & lag(class, 2) == "bad", "attack", "monitoring"))
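
If the threshold might change from 3, a running count of consecutive "bad" rows may be easier to adjust than chaining more lag() calls. The following is only a sketch in base R, not part of the original answer: the data frame is rebuilt by hand from the desired output in the question, and the names `min_run` and `bad_streak` are hypothetical, introduced here for illustration.

```r
# Stand-in for pred_avg, copied from the desired output in the question
pred_avg <- data.frame(
  DATE = c("2021-06-04", "2021-06-14", "2021-06-24", "2021-07-04",
           "2021-07-09", "2021-07-19", "2021-07-24", "2021-08-03",
           "2021-08-08", "2021-08-13", "2021-08-23", "2021-09-02",
           "2021-09-07", "2021-09-22", "2021-10-22", "2021-12-06",
           "2021-12-26"),
  class = c("good", "good", "bad", "bad", "bad", "good", "good", "good",
            "good", "good", "bad", "good", "good", "bad", "bad", "good",
            "good")
)

# Hypothetical parameter: consecutive "bad" rows needed to trigger "attack"
min_run <- 3

# Run-length encode the "bad" indicator (rows are assumed sorted by DATE)
runs <- rle(pred_avg$class == "bad")

# Position of each row inside its run, zeroed out for non-"bad" runs,
# so bad_streak counts how many "bad" rows in a row end at that date
bad_streak <- sequence(runs$lengths) * rep(runs$values, runs$lengths)

pred_avg$decision <- ifelse(bad_streak >= min_run, "attack", "monitoring")
pred_avg
```

With `>=`, a fourth or later consecutive "bad" would also be flagged as attack, which matches the behaviour of the lag() comparison; use `bad_streak == min_run` instead if only the date that completes the run should be flagged. This version also returns no NA when the series starts with "bad", whereas lag() yields NA for the first rows.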