How can I randomly modify a fraction of the values within a column to a new value based on a condition from another column using R?

by MR

October 25, 2022

I would like to know whether there is a way in R to negate a random fraction of the values in one column, conditional on the values in another column. In the example data frame below, I’d like to randomly select 10% of the exposure values and make them negative while keeping their magnitude, but only among the rows that have "Toy" listed as the object.

df <- data.frame(ChildID=c("M1", "F1", "F1", "F2", "M2", "M3", "M3", "M3", "M3", "F3", "F1", "F2", "M2", "M3"),
                object=c("Mouth", "Toy", "Mouth", "Toy", "Toy", "Toy", "Mouth", "Toy", "Toy", "Mouth", "Toy", "Toy", "Toy", "Toy"),
                exposure=c(0.1, 0.2, 0.1, 0.05, 0.6, 0.1, 0.4, 0.1, 1.0, 0.5, 0.1, 0.4, 0.1, 1.0))

Here’s an example of what I would like the result to look like (one "Toy" row, M2, has been negated):

ChildID	object	exposure
M1	Mouth	0.1
F1	Toy	0.2
F1	Mouth	0.1
F2	Toy	0.05
M2	Toy	-0.6
M3	Toy	0.1
M3	Mouth	0.4
M3	Toy	0.1
M3	Toy	1.0
F3	Mouth	0.5
F1	Toy	0.1
F2	Toy	0.4
M2	Toy	0.1
M3	Toy	1.0

I tried using dplyr, but I can’t use filter() because that drops the rows I don’t want to mutate. I realize this is a basic question, but I’m pulling my hair out trying to find the right workaround. Thanks so much!

Solution:

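One option might be the following, which negates a random set of "Toy" rows whose size is 10% of all rows, rounded down. Because the rows are sampled at random, the negated row will differ from run to run: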

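library(dplyr)

df %>%
 mutate(rowid = 1:n(),  # temporary row index
        # negate a random sample of "Toy" rows; sample size = 10% of all rows, rounded down
        exposure_new = if_else(rowid %in% sample(rowid[object == "Toy"], floor((n()*10)/100)), -exposure, exposure)) %>%
 select(-rowid)  # drop the temporary index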

   ChildID object exposure exposure_new
1       M1  Mouth     0.10         0.10
2       F1    Toy     0.20         0.20
3       F1  Mouth     0.10         0.10
4       F2    Toy     0.05         0.05
5       M2    Toy     0.60         0.60
6       M3    Toy     0.10         0.10
7       M3  Mouth     0.40         0.40
8       M3    Toy     0.10         0.10
9       M3    Toy     1.00         1.00
10      F3  Mouth     0.50         0.50
11      F1    Toy     0.10         0.10
12      F2    Toy     0.40        -0.40
13      M2    Toy     0.10         0.10
14      M3    Toy     1.00         1.00

If the 10% should instead be computed from the number of "Toy" rows only, rather than from all rows:

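# assumes dplyr is loaded, i.e. library(dplyr)
df %>%
 mutate(rowid = 1:n(),  # temporary row index
        # sample size = 10% of the "Toy" rows only, rounded down
        exposure_new = if_else(rowid %in% sample(rowid[object == "Toy"], floor((sum(object == "Toy")*10)/100)), -exposure, exposure)) %>%
 select(-rowid)  # drop the temporary index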
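For comparison, the same idea can be sketched in base R without dplyr. This is only a sketch under a few assumptions: the names toy_rows, n_flip and flip are made up for illustration, and the seed value is arbitrary and only there to make the run repeatable. It samples positions with sample.int() rather than calling sample() on the row indices directly, since sample(x, ...) treats a single number x as the range 1:x, which could matter if only one "Toy" row existed.

```r
# Example data from the question.
df <- data.frame(ChildID=c("M1", "F1", "F1", "F2", "M2", "M3", "M3", "M3", "M3", "F3", "F1", "F2", "M2", "M3"),
                 object=c("Mouth", "Toy", "Mouth", "Toy", "Toy", "Toy", "Mouth", "Toy", "Toy", "Mouth", "Toy", "Toy", "Toy", "Toy"),
                 exposure=c(0.1, 0.2, 0.1, 0.05, 0.6, 0.1, 0.4, 0.1, 1.0, 0.5, 0.1, 0.4, 0.1, 1.0))

set.seed(42)  # arbitrary seed (assumption), only so repeated runs pick the same rows

# Indices of the rows that are eligible for negation.
toy_rows <- which(df$object == "Toy")

# 10% of the "Toy" rows, rounded down (same arithmetic as the dplyr version).
n_flip <- floor((length(toy_rows) * 10) / 100)

# Sample positions within toy_rows, then map them back to row indices.
flip <- toy_rows[sample.int(length(toy_rows), n_flip)]

# Copy the column and negate only the sampled rows; all other rows are untouched.
df$exposure_new <- df$exposure
df$exposure_new[flip] <- -df$exposure[flip]

df
```

With the example data there are 10 "Toy" rows, so exactly one of them ends up negated; which one depends on the seed.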