I would like to know if there is a way to negate a random fraction of the values in one column based on the values in another column in R. In the example data frame below, I'd like to randomly select 10% of the exposure values and flip them to negative (same magnitude), but only among the rows that have "Toy" listed as the object.
df <- data.frame(ChildID=c("M1", "F1", "F1", "F2", "M2", "M3", "M3", "M3", "M3", "F3", "F1", "F2", "M2", "M3"),
object=c("Mouth", "Toy", "Mouth", "Toy", "Toy", "Toy", "Mouth", "Toy", "Toy", "Mouth", "Toy", "Toy", "Toy", "Toy"),
exposure=c(0.1, 0.2, 0.1, 0.05, 0.6, 0.1, 0.4, 0.1, 1.0, 0.5, 0.1, 0.4, 0.1, 1.0))
Here’s what I would like the result to look like, for example.
| ChildID | object | exposure |
|---|---|---|
| M1 | Mouth | 0.1 |
| F1 | Toy | 0.2 |
| F1 | Mouth | 0.1 |
| F2 | Toy | 0.05 |
| M2 | Toy | -0.6 |
| M3 | Toy | 0.1 |
| M3 | Mouth | 0.4 |
| M3 | Toy | 0.1 |
| M3 | Toy | 1.0 |
| F3 | Mouth | 0.5 |
| F1 | Toy | 0.1 |
| F2 | Toy | 0.4 |
| M2 | Toy | 0.1 |
| M3 | Toy | 1.0 |
I tried using dplyr, but I can't use filter() because that removes the other rows, which I want to keep unchanged. I realize this is a basic question, but I'm pulling my hair out trying to find the right workaround. Thanks so much!
>Solution :
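One option might be the following. Note that it negates a number of "Toy" rows equal to 10% of all rows in the data frame, rounded down:
library(dplyr)

# Tag each row, draw the row ids to negate from the "Toy" rows only, then flip their sign
df %>%
  mutate(rowid = row_number(),
         exposure_new = if_else(rowid %in% sample(rowid[object == "Toy"], floor((n() * 10) / 100)), -exposure, exposure)) %>%
  select(-rowid)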
   ChildID object exposure exposure_new
1       M1  Mouth     0.10         0.10
2       F1    Toy     0.20         0.20
3       F1  Mouth     0.10         0.10
4       F2    Toy     0.05         0.05
5       M2    Toy     0.60         0.60
6       M3    Toy     0.10         0.10
7       M3  Mouth     0.40         0.40
8       M3    Toy     0.10         0.10
9       M3    Toy     1.00         1.00
10      F3  Mouth     0.50         0.50
11      F1    Toy     0.10         0.10
12      F2    Toy     0.40        -0.40
13      M2    Toy     0.10         0.10
14      M3    Toy     1.00         1.00
If the 10% should instead be computed from the number of "Toy" rows only, rather than from all rows:
# Same approach, but the sample size is 10% of the "Toy" rows, rounded down
df %>%
  mutate(rowid = row_number(),
         exposure_new = if_else(rowid %in% sample(rowid[object == "Toy"], floor((sum(object == "Toy") * 10) / 100)), -exposure, exposure)) %>%
  select(-rowid)
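A caveat with both snippets: when given a single number, sample() draws from 1:x instead of from that one value, so a data frame with exactly one "Toy" row could end up with the wrong row negated. Below is a minimal base R sketch that sidesteps this by sampling positions with sample.int(). The helper name negate_fraction() and its arguments are hypothetical, not part of dplyr or of the answer above, and the seed is arbitrary and only there to make the draw repeatable.

```r
# Hypothetical helper: negate a random fraction of `exposure`
# among the rows where `object` equals `value`.
negate_fraction <- function(data, value = "Toy", prop = 0.10) {
  idx <- which(data$object == value)   # positions of the eligible rows
  k <- floor(length(idx) * prop)       # number of rows to negate, rounded down
  data$exposure_new <- data$exposure
  if (k > 0) {
    # Sample positions within idx, so a single eligible row is never
    # expanded to 1:idx the way sample(idx, k) would expand it
    chosen <- idx[sample.int(length(idx), k)]
    data$exposure_new[chosen] <- -data$exposure[chosen]
  }
  data
}

df <- data.frame(ChildID = c("M1", "F1", "F1", "F2", "M2", "M3", "M3", "M3", "M3", "F3", "F1", "F2", "M2", "M3"),
                 object = c("Mouth", "Toy", "Mouth", "Toy", "Toy", "Toy", "Mouth", "Toy", "Toy", "Mouth", "Toy", "Toy", "Toy", "Toy"),
                 exposure = c(0.1, 0.2, 0.1, 0.05, 0.6, 0.1, 0.4, 0.1, 1.0, 0.5, 0.1, 0.4, 0.1, 1.0))

set.seed(42)  # arbitrary seed, assumed only for reproducibility
result <- negate_fraction(df)
```

With the example data there are 10 "Toy" rows, so one row is negated. Which row that is depends on the seed.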