Say I have these data:
clear all
input id var1 var2 var3 key
1 23 45 8 2
2 11 7 99 3
3 29 22 1 1
4 5 61 5 2
end
How can I create another variable called want
that uses the key
variable as a key to identify which variable (among var1
, var2
, and var3
) from which to copy the value? As an example, the first row (id
1) has a key value of 2
. Thus, we want to assign want
the value that var2
(where we care about var2
because the key is 2) has, which is 45
.
Here is the end result that I would like to obtain:
clear all
input id var1 var2 var3 key want
1 23 45 8 2 45
2 11 7 99 3 99
3 29 22 1 1 29
4 5 61 5 2 61
end
Is there a way to do this without resorting to a loop through all observations? (I think if I were to take this row-by-row approach, for each iteration, I could create a local (with value var1
, var2
, or var3
depending on the key) that I could use to get the value. But with millions of observations, this approach might be slow.)
Of course, one could create a variable like this (gen var = "var"+string(key)
), but then I am not sure if it is possible to use this to access other variables like I want. So this is probably a dead end.
>Solution :
Here are two ways to do it.
clear all
input id var1 var2 var3 key
1 23 45 8 2
2 11 7 99 3
3 29 22 1 1
4 5 61 5 2
end
gen wanted = .
forval j = 1/3 {
replace wanted = var`j' if key == `j'
}
gen WANTED = cond(key == 1, var1, cond(key == 2, var2, var3))
The second way is surely preferable if the real problem is about as complicated as the presented problem. The first way seems preferable if the real problem involves several variables, as a loop over several variables is about as much code as a loop over few.