R: Problem using a for loop to modify existing variables in a data.table; the loop does not affect the row filtering

by MR

March 29, 2022

Thanks in advance, and sorry if something is unclear, it’s my first time posting here. I am working on something that should be fairly simple, but I cannot seem to find a way of making it work.

The task that I want to complete is the following:
I have a dataset with hundreds of variables, and I need to recode all of them following the same logic: if the GIVEN VARIABLE == 0 & the SPECIFIC VARIABLE == 1, the GIVEN VARIABLE must be set to -1. The SPECIFIC VARIABLE is the same for all of them.

What I have done is the following:

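library(data.table)
set.seed(123)
# rep() spells out the 1, 0, 1, ... pattern of d for all 11 rows
data <- data.table(a = 0:10, b = 0:10, c = 0:10, d = rep(1:0, length.out = 11))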

Here "d" is the SPECIFIC VARIABLE and a:c are the GIVEN VARIABLEs

list_variables <- names(data)  
list_variables_v2 <- list_variables[-c(4)]

I extracted the variable names from the dataset (minus d) and stored them in a character vector, so they can be fed into the loop

data_v1 = copy(data)

for(i in (list_variables_v2)) {
  data_v1[(i) == 0 & d == 1, (i) := -1]
}

The problem is that when I run the loop, nothing happens: values that meet the condition (e.g. a == 0 & d == 1) are not recoded as -1. Several things could be going wrong, but I think I have narrowed it down to one. Potential problems:

a) The code, even outside the loop, does not work. But this is not true. The following code produces the expected result:

data_v1[a == 0 & d == 1, a := -1]

b) The loop itself is not working, so the variable names are not being iterated over and recognized. But if I drop the (i) == 0 condition, the code does work, which implies that the loop works for the column-assignment side:

for(i in (list_variables_v2)) {
  data_v1[d == 1, (i) := -1]
}

I think the root of the problem is that R, on the row-filtering side, is not treating (i) == 0 as e.g. a == 0. This is quite odd, given that on the column-assignment side R does treat (i) := -1 as e.g. a := -1. Any idea what might be causing this and, hopefully, how to solve it?

Again, many many thanks, and please let me know if something is unclear or repeated.

Solution:

A simple fix is to wrap the column name in get() inside the row filter:

for(i in (list_variables_v2)) {
  data_v1[get(i) == 0 & d == 1, (i) := -1]
}
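
To see why the original filter matched nothing, here is a minimal sketch. It assumes a small hypothetical table named dt (not the asker's data). In the row filter, (i) is only the character string held by the loop variable, whereas get(i) looks up the column with that name.

```r
library(data.table)

# Hypothetical toy table, used only for this illustration
dt <- data.table(a = 0:3, d = c(1L, 0L, 1L, 0L))
i <- "a"

# (i) is just the string "a"; comparing it with 0 yields a single FALSE
(i) == 0
#> [1] FALSE

# That single FALSE is recycled against d == 1, so no row is selected
dt[(i) == 0 & d == 1, .N]
#> [1] 0

# get(i) returns the column named by i, so the comparison is done per row
dt[get(i) == 0 & d == 1, .N]
#> [1] 1
```

On the left of :=, by contrast, data.table interprets (i) as "the column whose name is stored in i", which is why the assignment side already worked in the question.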

Output (data_v1 is recoded; the original data is unchanged):

> data_v1
        a     b     c     d
    <int> <int> <int> <int>
 1:    -1    -1    -1     1
 2:     1     1     1     0
 3:     2     2     2     1
 4:     3     3     3     0
 5:     4     4     4     1
 6:     5     5     5     0
 7:     6     6     6     1
 8:     7     7     7     0
 9:     8     8     8     1
10:     9     9     9     0
11:    10    10    10     1

> data
        a     b     c     d
    <int> <int> <int> <int>
 1:     0     0     0     1
 2:     1     1     1     0
 3:     2     2     2     1
 4:     3     3     3     0
 5:     4     4     4     1
 6:     5     5     5     0
 7:     6     6     6     1
 8:     7     7     7     0
 9:     8     8     8     1
10:     9     9     9     0
11:    10    10    10     1
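
If you would rather avoid get() inside the filter, one possible alternative is data.table's set(), which takes the column name as a string and also updates by reference. The following is only a sketch under the question's setup; given_vars and rows are names introduced here for illustration.

```r
library(data.table)

data <- data.table(a = 0:10, b = 0:10, c = 0:10,
                   d = rep(1:0, length.out = 11))
data_v1 <- copy(data)

# All columns except the SPECIFIC VARIABLE d
given_vars <- setdiff(names(data_v1), "d")

for (v in given_vars) {
  # Row numbers where the given column is 0 and d is 1
  rows <- which(data_v1[[v]] == 0 & data_v1[["d"]] == 1)
  # set() modifies data_v1 in place; j accepts a column name
  set(data_v1, i = rows, j = v, value = -1L)
}
```

Selecting the columns with setdiff() rather than by position (-c(4)) keeps the loop correct even if the column order changes.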