Divide different groups by reference group

My problem is almost identical to the one in this answered question: Divide different groups by reference group

I have this df, only with more grouping variables (and no result column):

df <- data.frame(pop = c(1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3),
                 # state codes must be quoted strings; 16 values to match the other columns
                 state = c("NJ","NJ","NJ","NJ","VT","VT","VT","VT","DC","DC","DC","DC","IL","IL","IL","IL"),
                 # dates must be quoted, otherwise 2010-01-01 is evaluated as arithmetic
                 start_dt = as.Date(c("2010-01-01","2010-01-01","2010-01-01","2010-01-01","2010-01-02","2010-01-02","2010-01-02","2010-01-02","2010-02-03","2010-02-03","2010-02-03","2010-02-03","2010-03-05","2010-03-05","2010-03-05","2010-03-05")),
                 end_dt = as.Date(c("2011-01-01","2011-01-01","2011-01-01","2011-01-01","2011-01-02","2011-01-02","2011-01-02","2011-01-02","2011-02-03","2011-02-03","2011-02-03","2011-02-03","2011-03-05","2011-03-05","2011-03-05","2011-03-05")),
                 value = c(12,7,6,9,15,7,6,9,18,5,6,3,20,5,5,6),
                 group = c("denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3"))
# Desired result column (not in the real data): c(1,0.58,0.5,0.75,1,0.46...)

I also want to group the data by pop (population), state, start_dt and end_dt, and then divide each group's value by the denominator value from the same grouping to get the result column. I tried the accepted answer and did something like:


library(dplyr)

# Attempt 1: keep every row and add the ratio with mutate()
df <- df %>%
  group_by(pop,state,start_dt,end_dt) %>%
  mutate(result=value/value[group == "denominator"])

# Attempt 2: keep only the non-denominator rows with summarize()
df <- df %>%
  group_by(pop,state,start_dt,end_dt) %>%
  summarize(result = value[group != "denominator"] / value[group == "denominator"])

But I got this error:

group_by: 4 grouping variables (pop, state, start_dt, msr_prd_end_dt)
Error in `.fun()`:
! Problem while computing `result=value/value[group == "denominator"]`.
x `result` must be size 1, not 0.
i The error occurred in group 99: pop = "1", group = "Treated2", state =
  "DC", start_dt = 2010-01-01, end_dt = 2011-02-01.
Backtrace:
 1. ... %>% ...
 2. tidylog::mutate(., result=value/value[group == "denominator"])
 3. tidylog:::log_mutate(...)
 5. dplyr:::mutate.data.frame(.data, ...)

Any ideas?

>Solution :

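The issue is most likely that at least one of the groups has no denominator row. For such a group, value[group == "denominator"] has length 0, so the division also returns a length-0 vector, which is what the "must be size 1, not 0" message is complaining about. We can use [1] to take the first element of the subset: when the subset is empty, this returns NA, so those groups get NA instead of raising an error.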

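library(dplyr)
df %>%
  group_by(pop, state, start_dt, end_dt) %>%
  # [1] picks the group's denominator, or NA if the group has none
  summarize(result = value[group != "denominator"] /
              value[group == "denominator"][1],
            # keep the label of each treated row next to its ratio
            group = group[group != "denominator"], .groups = "drop")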
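
To see why the [1] helps, here is a minimal base R sketch of the same idea. The toy data frame and its key column are made up for illustration (they are not the asker's data), and split()/unsplit() stand in for group_by() so the example runs without dplyr.

```r
# Hypothetical toy data: key "b" has no "denominator" row.
toy <- data.frame(
  key   = c("a", "a", "a", "b", "b"),
  group = c("denominator", "Treated1", "Treated2", "Treated1", "Treated2"),
  value = c(12, 7, 6, 5, 6)
)

# A condition that matches nothing gives a length-0 vector, and dividing
# by a length-0 vector gives a length-0 result.
b <- toy[toy$key == "b", ]
empty_ratio <- b$value / b$value[b$group == "denominator"]
length(empty_ratio)

# Element [1] of a length-0 vector is NA, so the division recycles as
# usual and every row of that group becomes NA.
safe_ratio <- b$value / b$value[b$group == "denominator"][1]

# The same expression applied per key, then put back in row order.
toy$result <- unsplit(
  lapply(split(toy, toy$key), function(g) {
    g$value / g$value[g$group == "denominator"][1]
  }),
  toy$key
)
toy
```

Groups with a denominator get their ratios, and groups without one get NA, which can be filtered out or inspected afterwards to find the incomplete groups.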