
R: Problem using a for loop to modify existing variables in a data.table; the loop does not affect the row filtering

Thanks in advance, and sorry if something is unclear, it’s my first time posting here. I am working on something that should be fairly simple, but I cannot seem to find a way of making it work.

The task that I want to complete is the following:
I have a dataset with hundreds of variables. I need to recode all of them following the same logic. The logic is the following: if the GIVEN VARIABLE == 0 & a SPECIFIC VARIABLE == 1, the GIVEN VARIABLE must = -1. The SPECIFIC VARIABLE is the same for all of them.

What I have done is the following:

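library(data.table)

set.seed(123)
# d alternates 1, 0, 1, ...; rep() makes the recycling to 11 rows explicit
data = data.table(a = 0:10, b = 0:10, c = 0:10, d = rep(1:0, length.out = 11))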

Here "d" is the SPECIFIC VARIABLE and a:c are the GIVEN VARIABLEs

list_variables <- names(data)  
list_variables_v2 <- list_variables[-c(4)] 

I extracted the names of the variables from the dataset (minus d) and stored them in a character vector, so they can be fed into the loop

data_v1 = copy(data)

for(i in (list_variables_v2)) {
  data_v1[(i) == 0 & d == 1, (i) := -1]
}

Problematically, when I run the loop nothing happens. Those variables that comply with the condition (e.g. a == 0 & d == 1) are not recoded as -1. Various problems could be happening, but I think I have reduced them to one. Potential problems:

a) The code, even outside the loop, does not work. But this is not true. The following code produces the expected result:

data_v1[a == 0 & d == 1, a := -1]

b) The loop itself is not working, i.e. the variable names are not being picked up and recognized. Nonetheless, if I exclude the (i) == 0 condition, the code does work, implying that the loop works for the right side:

for(i in (list_variables_v2)) {
  data_v1[d == 1, (i) := -1]
}

I think that the root of the problem is that R, in the row filtering side, is not recognizing (i) == 0 as e.g. a == 0. This is quite weird given that R, when dealing with the right side (columns), does recognize that (i) := -1 as e.g. a := -1. Any idea of what might be causing this and, hopefully, how to solve it?
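
This suspicion can be checked directly. The following is a minimal sketch on a two-row toy table; the names `dt` and `col_name` are illustrative and not part of the code above. It suggests that, in the row filter, the parenthesized name is evaluated as a plain character string rather than as a column:

```r
library(data.table)

# Toy table with illustrative names (not from the code above)
dt <- data.table(a = c(0L, 1L), d = c(1L, 1L))
col_name <- "a"

# Parentheses do not turn the string into a column reference:
# this compares the text "a" with 0, which is FALSE
literal_cmp <- (col_name) == 0

# FALSE & anything is FALSE, so this filter selects no rows
n_literal <- nrow(dt[(col_name) == 0 & d == 1])

# Looking the column up by name compares its values instead
n_lookup <- nrow(dt[get(col_name) == 0 & d == 1])
```

Here `n_literal` counts the rows matched by the string comparison and `n_lookup` counts the rows matched once the column is looked up by name.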

Again, many many thanks, and please let me know if something is unclear or repeated.

Solution:

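A simple correction would be to wrap the column name with get. In the row filter, (i) is just the character string stored in i (e.g. "a"), so (i) == 0 compares that string with 0 and is always FALSE. get(i) looks up the column with that name instead. On the left side of :=, by contrast, data.table treats a parenthesized name as the column to assign, which is why (i) := -1 works.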

for(i in (list_variables_v2)) {
  data_v1[get(i) == 0 & d == 1, (i) := -1]
}

-output

> data_v1
        a     b     c     d
    <int> <int> <int> <int>
 1:    -1    -1    -1     1
 2:     1     1     1     0
 3:     2     2     2     1
 4:     3     3     3     0
 5:     4     4     4     1
 6:     5     5     5     0
 7:     6     6     6     1
 8:     7     7     7     0
 9:     8     8     8     1
10:     9     9     9     0
11:    10    10    10     1

> data
        a     b     c     d
    <int> <int> <int> <int>
 1:     0     0     0     1
 2:     1     1     1     0
 3:     2     2     2     1
 4:     3     3     3     0
 5:     4     4     4     1
 6:     5     5     5     0
 7:     6     6     6     1
 8:     7     7     7     0
 9:     8     8     8     1
10:     9     9     9     0
11:    10    10    10     1
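
As a possible alternative to get, the same recode can be sketched with data.table's set(), which takes row numbers and a column name directly and so avoids evaluating the name inside the row filter. The names `data_v2`, `given_cols`, `col_name` and `rows` are illustrative, and the data is rebuilt here only so the snippet runs on its own:

```r
library(data.table)

# Same toy data as in the question, with the recycling of d made explicit
data <- data.table(a = 0:10, b = 0:10, c = 0:10, d = rep(1:0, length.out = 11))
data_v2 <- copy(data)

# All columns except the flag column d
given_cols <- setdiff(names(data_v2), "d")

for (col_name in given_cols) {
  # Row numbers where the given column is 0 and the flag d is 1
  rows <- which(data_v2[[col_name]] == 0 & data_v2$d == 1)
  # Assign by reference; -1L keeps the columns integer
  set(data_v2, i = rows, j = col_name, value = -1L)
}
```

Because the work is done on a copy, data itself is left unchanged, as in the get version.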