How to access group ids for grouped filter operation?

I have a data frame containing dates and associated values for three groups as follows.

library(lubridate)
library(magrittr)
library(dplyr)

data <-
  data.frame(group = rep(c("a", "b", "c"), each = 5),
             date = rep(seq(ymd(20200101), ymd(20200105), by = 1),
                        times = 3),
             value = runif(15))

I would like to perform a grouped filter operation in order to extract subsets of the data by date, where the start date varies by group.

start_date <- list(a = ymd(20200102), b = ymd(20200104), c = ymd(20200103))

I’d like to use the group name to index into the start date list. I tried the operation as follows, but it fails with an error message.

data %>%
  group_by(group) %>%
  filter(date >= start_date[[group]]) 
Error in app$vspace(new_style$`margin-top` %||% 0) :
  attempt to apply non-function

What am I doing wrong?

(The alternative to the above procedure is to set the start dates up as a data frame, join the data frame to data, and perform an ungrouped filter. This works as intended, but it’s less elegant (in my opinion) and I’d prefer to keep the start dates as a list for other reasons.)

start_date_2 <- 
  data.frame(group = c("a", "b", "c"), 
             start_date = c(ymd(20200102), ymd(20200104), ymd(20200103)))

data %>%
  left_join(start_date_2, by = "group") %>%
  filter(date >= start_date) %>%
  select(-start_date)

Solution:

We can access the key of the current group with cur_group(). It returns a one-row tibble with one column per grouping variable, each holding the current group's value, so we extract the group name with $group.

library(dplyr)

data %>%
  group_by(group) %>%
  filter(date >= start_date[[cur_group()$group]])

#> # A tibble: 9 x 3
#> # Groups:   group [3]
#>   group date        value
#>   <chr> <date>      <dbl>
#> 1 a     2020-01-02 0.225 
#> 2 a     2020-01-03 0.345 
#> 3 a     2020-01-04 0.110 
#> 4 a     2020-01-05 0.0951
#> 5 b     2020-01-04 0.356 
#> 6 b     2020-01-05 0.345 
#> 7 c     2020-01-03 0.0973
#> 8 c     2020-01-04 0.344 
#> 9 c     2020-01-05 0.418

Created on 2022-03-29 by the reprex package (v0.3.0)
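
A likely reason the original attempt fails is that, inside a grouped filter(), `group` refers to the column values for the current group (five strings here) rather than a single key, and `[[` cannot look up one list element from a length-5 character vector. The sketch below illustrates this and also shows an ungrouped lookup that keeps start_date as a list. It is only a sketch under a few assumptions: the data is rebuilt with base as.Date() instead of lubridate, and the names `failed`, `keys`, `start_num`, `grouped` and `ungrouped` are introduced here for illustration and are not part of the original question or answer.

```r
library(dplyr)

# Same shape as the question's data, built with base as.Date().
data <- data.frame(
  group = rep(c("a", "b", "c"), each = 5),
  date  = rep(seq(as.Date("2020-01-01"), as.Date("2020-01-05"), by = "day"),
              times = 3),
  value = runif(15)
)
start_date <- list(a = as.Date("2020-01-02"),
                   b = as.Date("2020-01-04"),
                   c = as.Date("2020-01-03"))

# `[[` with a character vector longer than one is treated as recursive
# indexing, not as a single-element lookup, so this call errors.
failed <- tryCatch({ start_date[[rep("a", 5)]]; FALSE },
                   error = function(e) TRUE)

# cur_group()$group is a single string per group, which `[[` accepts.
keys <- data %>%
  group_by(group) %>%
  summarise(key = cur_group()$group)

# The grouped filter from the answer above.
grouped <- data %>%
  group_by(group) %>%
  filter(date >= start_date[[cur_group()$group]]) %>%
  ungroup()

# Ungrouped alternative: turn the list into a named numeric vector
# (days since 1970-01-01) and index it by the group column with `[`.
start_num <- vapply(start_date, as.numeric, numeric(1))
ungrouped <- data %>%
  filter(as.numeric(date) >= unname(start_num[as.character(group)]))
```

Both `grouped` and `ungrouped` should contain the same 9 rows as the output above. The ungrouped version avoids evaluating the condition once per group, which may matter when there are many groups, while the cur_group() version stays closer to the original intent of indexing the list by group name.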
