Create variable that uses value of another variable to determine which other variable from which to take value

Say I have these data:

clear all
input id var1 var2 var3 key
      1  23   45   8    2
      2  11   7    99   3
      3  29   22   1    1
      4  5    61   5    2
end

How can I create another variable called want that uses the key variable as a key to identify which variable (among var1, var2, and var3) from which to copy the value? As an example, the first row (id 1) has a key value of 2. Thus, we want to assign want the value that var2 (where we care about var2 because the key is 2) has, which is 45.

Here is the end result that I would like to obtain:

clear all
input id var1 var2 var3 key want
      1  23   45   8    2   45
      2  11   7    99   3   99
      3  29   22   1    1   29
      4  5    61   5    2   61
end

Is there a way to do this without resorting to a loop through all observations? (I think if I were to take this row-by-row approach, for each iteration, I could create a local (with value var1, var2, or var3 depending on the key) that I could use to get the value. But with millions of observations, this approach might be slow.)

Of course, one could create a variable like this (gen var = "var"+string(key)), but then I am not sure if it is possible to use this to access other variables like I want. So this is probably a dead end.

>Solution :

Here are two ways to do it.

clear all
input id var1 var2 var3 key
      1  23   45   8    2
      2  11   7    99   3
      3  29   22   1    1
      4  5    61   5    2
end

gen wanted = . 
forval j = 1/3 { 
    replace wanted = var`j' if key == `j'
}

gen WANTED = cond(key == 1, var1, cond(key == 2, var2, var3))

The second way is surely preferable if the real problem is about as complicated as the presented problem. The first way seems preferable if the real problem involves several variables, as a loop over several variables is about as much code as a loop over few.

Leave a Reply