Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

sample values by group with conditions

I have grouped data and I want to create a new variable value that will take the value 0 or 1.

  1. Every group needs at least one observation where value==1.
  2. But groups cannot have more than 2 observations where value==1.
  3. Ideally I can set it so no more than 25% of groups only have one observation where value==1.
library(tidyverse)
set.seed(1)

# sample can break the rules
tibble(group = c(rep("A", 3),
                 rep("B", 6),
                 rep("C", 4),
                 rep("D", 5))) %>%
  group_by(group) %>%
  mutate(value = sample(c(0, 1), n(), replace = TRUE, prob = c(0.8, 0.2)))

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

One solution would be to create a listing of your unique group labels and shuffle those (here I get the unique group labels via nest). Then depending on whether the group is in the first 25% of rows of the data frame, you can assign either a) a random number between 1 and 2, or b) always 2. Finally, you can use the assigned number to define how 0s and 1s should be sampled for each group, and then unnest the result.

set.seed(0)
result <- df %>% 
  nest(data = -group) %>% 
  .[sample(1:nrow(.), nrow(.)), ] %>% # shuffle the group order
  mutate(
    value_count = ifelse(row_number() / n() <= 0.25, sample(1:2, n(), replace = T), 2)
  ) %>% 
  rowwise() %>% 
  mutate(
    count = nrow(data),
    value = list(sample(c(rep(1, value_count), rep(0, count - value_count)), count))
  ) %>% 
  unnest(value) %>% 
  select(-data, -value_count, -count)

   group value
   <chr> <dbl>
 1 B         0
 2 B         0
 3 B         0
 4 B         0
 5 B         1
 6 B         0
 7 A         1
 8 A         1
 9 A         0
10 D         1
11 D         0
12 D         1
13 D         0
14 D         0
15 C         1
16 C         0
17 C         0
18 C         1
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading