Mutate a column using case_when for n% of a group

I have a data frame

df<-data.frame(id=rep(1:10,each=10),
               Room1=rnorm(100,0.4,0.5),
               Room2=rnorm(100,0.3,0.5),
               Room3=rnorm(100,0.7,0.5))

I want to overwrite the Room1 column for one group (the rows where id == 10) using case_when():

data <- df %>%
  mutate(Room1 = case_when(
    id==10 ~ 0.6,
    TRUE ~ as.numeric(Room1)
))

However, I only want to do this for 20% of the rows where id == 10, and those rows should be chosen at random. Can anyone help? Thanks in advance.

Solution:

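Group by id, then use dplyr::percent_rank(runif(n())) <= .2 to flag a random 20% of the rows within each id. With 10 rows per group, this flags exactly 2 rows; for other group sizes the share may be only approximately 20%.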
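To see how that selection rule behaves, here is a minimal sketch on a single 10-row group, outside of any data frame. The names u, pr and picked are only illustrative, and the seed is arbitrary. It relies on percent_rank() rescaling ranks to (rank - 1) / (n - 1), so with 10 values the two smallest draws get 0 and 1/9, and only those fall at or below .2.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

u <- runif(10)          # one random draw per row of a 10-row group
pr <- percent_rank(u)   # rescaled ranks: 0, 1/9, 2/9, ..., 1 (in random order)
picked <- pr <= 0.2     # TRUE only for the two smallest draws (0 and 1/9)

sum(picked)             # number of rows that would be overwritten
```

Because the draws are random, which two positions are flagged changes with the seed, but the count stays at 2 for a group of 10.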

I assume you intend to add more conditions to your case_when(); otherwise, you can use if_else() instead.

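library(dplyr)
set.seed(13)  # make the random selection reproducible

data <- df %>%
  group_by(id) %>%  # n() and percent_rank() are evaluated within each id
  mutate(Room1 = case_when(
    # overwrite a random 20% of the rows in group 10
    id == 10 & percent_rank(runif(n())) <= .2 ~ 0.6,
    # leave every other row unchanged
    TRUE ~ Room1
  )) %>%
  ungroup()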

tail(data, 10)
# A tibble: 10 × 4
      id  Room1   Room2   Room3
   <int>  <dbl>   <dbl>   <dbl>
 1    10  0.590  0.801   0.745 
 2    10  0.117  0.517  -0.491 
 3    10 -0.207  0.533   2.15  
 4    10 -0.282 -0.249   0.828 
 5    10  0.6    0.605   0.778 
 6    10  0.272  0.308   0.0575
 7    10 -0.213  0.668   0.476 
 8    10  0.507  0.923  -0.0948
 9    10  0.434 -0.0663  0.0720
10    10  0.6    0.264   0.647 
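If there really is only this one condition, the if_else() alternative mentioned in the answer could look roughly like the sketch below. It rebuilds the example data frame with an arbitrary seed so that it runs on its own; the result name data2 is made up for illustration, and the values will not match the output printed above.

```r
library(dplyr)

set.seed(13)  # arbitrary seed for a reproducible sketch

# Same shape as the data frame in the question
df <- data.frame(id = rep(1:10, each = 10),
                 Room1 = rnorm(100, 0.4, 0.5),
                 Room2 = rnorm(100, 0.3, 0.5),
                 Room3 = rnorm(100, 0.7, 0.5))

data2 <- df %>%
  group_by(id) %>%
  # if_else(condition, value_if_true, value_if_false); both values are doubles
  mutate(Room1 = if_else(id == 10 & percent_rank(runif(n())) <= 0.2,
                         0.6, Room1)) %>%
  ungroup()

# Count how many rows in group 10 were overwritten
sum(data2$id == 10 & data2$Room1 == 0.6)
```

if_else() is stricter about types than base ifelse(), which is why the replacement value is written as the double 0.6 to match Room1.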
