Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Dplyr – Get Running average per entity based on a filter

dput of the input dataframe :

structure(list(Entity = c("A", "A", "A", "A", "A", "A", "A", 
"A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B"
), Date = structure(c(1640995200, 1641081600, 1641168000, 1641254400, 
1641340800, 1641427200, 1641513600, 1641600000, 1641686400, 1641772800, 
1640995200, 1641081600, 1641168000, 1641254400, 1641340800, 1641427200, 
1641513600, 1641600000, 1641686400, 1641772800), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), Test = c("Y", "Y", "N", "Y", "N", "N", "Y", "Y", 
"Y", "Y", "Y", "Y", "N", "Y", "N", "N", "Y", "Y", "Y", "Y"), 
    Value = c(5, 10, 5, 10, 10, 5, 5, 5, 5, 20, 10, 10, 5, 20, 
    20, 5, 5, 20, 20, 20), COUNTER = c(1L, 2L, 3L, 4L, 5L, 6L, 
    7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -20L))

I would like to get a final dataframe that shows running 3 day average per entity where Test = 'Y'. For ex., based on the below screenshot,

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

the running averages for the entity A would be:

Entity   Counter_Running_avg     Running_Avg

A         1                     7.5 (15/2)

A         2                     10 (10/1)

A         3                     5   (15/3)    and so on..

I began by writing the below code but this is not what I need…

dt %>%      
   arrange(Entity, Date) %>%
   group_by(Entity) %>%
   filter(Test = 'Y') %>%       
   summarise(Avg = mean(head(Value, 3), na.rm = TRUE))

>Solution :

df %>%
  group_by(Entity,id = (COUNTER - 1) %/%3 + 1) %>%
  summarise(Running_Avg = mean(Value[Test == 'Y']), .groups = 'drop')
  

# Groups:   Entity [2]
  Entity    id Running_Avg
  <chr>  <dbl>       <dbl>
1 A          1         7.5
2 A          2        10  
3 A          3         5  
4 A          4        20  
5 B          1        10  
6 B          2        20  
7 B          3        15  
8 B          4        20  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading