Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I check a state-year group contains a proportion lesser or equal to 0.60?

I have the following data frame:

structure(list(state = c("CA", "CA", "CA", "CA", "CA", "CA", 
"WA", "WA", "WA", "PA", "PA"), prop = c(0.4, 0.5, 0.3, NA, 0.61, 
0.3, 0.76, 0.4, 0.35, 0.21, NA), year = c(2000, 2000, 2001, 2001, 
2001, 2002, 2002, 2002, 2004, 2000, 2005)), class = "data.frame", row.names = c(NA, 
-11L))

I want to check if the previous state-year group in the dataset contains a prop lesser or equal than 0.6.

So, my desired output is:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

structure(list(state = c("CA", "CA", "CA", "CA", "CA", "CA", 
"WA", "WA", "WA", "PA", "PA"), prop = c(0.4, 0.5, 0.3, NA, 0.61, 
0.3, 0.76, 0.4, 0.35, 0.21, NA), year = c(2000, 2000, 2001, 2001, 
2001, 2002, 2002, 2002, 2004, 2000, 2005), competitive = c(NA, 
NA, 1, 1, 1, 0, NA, NA, 0, NA, 1)), class = "data.frame", row.names = c(NA, 
-11L))

As you can tell, since there is no data for CA-1999, the competitive column says NA. And, for CA-2001, since all the prop in CA-2000 are less than 0.6, they are all coded as competitive==1. And, for CA-2002, since CA-2001 contains one observation greater than 0.6, competitiveness==1. And same for PA. PA-2000 is the state-year preceding PA-2005 in the dataset, so the competitive value for PA-2005 is based on PA-2000.

I’ve tried many different methods and keep getting odd results, such as:

df <- df %>%
  arrange(state, year) %>%
  group_by(state) %>%
  mutate(
    prev = lag(prop <= 0.60),
    competitive = ifelse(is.na(prev), NA, prev),
    competitive = ifelse(any(prop >= 0.60), 0, 1)
    )

This is only taking the previous entry, and doesn’t consider the entire group.

>Solution :

Here’s one method


input %>% 
  summarize(is_comp = all(prop<.6), .by=c(state, year)) %>% 
  mutate(is_prev_comp = lag(is_comp)+0, .by=c(state)) %>% 
  select(-is_comp) %>% 
  right_join(input)

which returns

#    state year is_prev_comp prop
# 1     CA 2000           NA 0.40
# 2     CA 2000           NA 0.50
# 3     CA 2001            1 0.30
# 4     CA 2001            1   NA
# 5     CA 2001            1 0.61
# 6     CA 2002            0 0.30
# 7     WA 2002           NA 0.76
# 8     WA 2002           NA 0.40
# 9     WA 2004            0 0.35
# 10    PA 2000           NA 0.21
# 11    PA 2005            1   NA
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading