Generate dummies in Stata vs. in R


Stata

r, u, and s are dummy variables. Does the following line also generate a dummy n that equals 1 if r, u, or s equals 1, just with the == 1 omitted after each of r, u, and s?

generate byte n = r | u | s

R

Does it make a difference in R whether we specify a storage type such as byte when generating the variable, or is it all the same in R?

>Solution :

This answer addresses Stata questions only.

In Stata, if r, u, and s are all 0/1 variables, then r | u | s is also 0/1: it is 1 if any of them is 1, and 0 if and only if all of them are 0. So omitting == 1 is harmless in that case, and the expression is equivalent to max(r, u, s).

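But watch out: if r, u, and s can be 0, 1, or missing, then r | u | s will also be 1 if any of them is missing, because Stata treats any nonzero value, including missing, as true. By contrast, max(r, u, s) ignores missing values and will be missing only if all of them are missing.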
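To see the difference, here is a minimal sketch on a small made-up dataset (the five observations and the names n_or and n_max are illustrative assumptions, not from the question):

```stata
* Made-up data: five observations, some with missing values
clear
input r u s
0 0 0
1 0 0
0 . 0
. . .
1 . 0
end

* Logical or: a missing value counts as true
generate byte n_or = r | u | s

* max() ignores missing values unless every argument is missing
generate byte n_max = max(r, u, s)

list, noobs
```

In the third observation (0, missing, 0) the two expressions disagree: n_or is 1, while n_max is 0. In the fourth observation (all missing), n_or is 1 and n_max is missing.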

If missing values are present, then you could use one of the following:

  * 1 
  gen n = r | u | s if !missing(r, u, s) 

The result will be missing if any of r, u, s is missing. Otherwise, it will be 1 if any of them is 1 and 0 if all of them are 0.

  * 2 
  gen n = (r == 1) | (u == 1) | (s == 1) 

The result will be 1 if any argument is 1 and 0 otherwise. "Otherwise" is anything from all 0s to all missings.

  * 3 
  gen n = inlist(1, r, u, s) 

#3 is equivalent to #2.
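As a rough side-by-side check, the three options can be run on the same made-up data (the observations and the names n1, n2, n3 are hypothetical, chosen only for illustration):

```stata
* Made-up data: five observations, some with missing values
clear
input r u s
0 0 0
1 0 0
0 . 0
. . .
1 . 0
end

* 1: missing whenever any of r, u, s is missing
generate byte n1 = r | u | s if !missing(r, u, s)

* 2: 1 only if some variable is exactly 1, else 0
generate byte n2 = (r == 1) | (u == 1) | (s == 1)

* 3: same result as 2, written with inlist()
generate byte n3 = inlist(1, r, u, s)

* Confirm that options 2 and 3 agree in every observation
assert n2 == n3

list, noobs
```

Here n1 is missing in the last three observations, whereas n2 and n3 are never missing. Which behavior is preferable depends on whether a missing input should propagate to n or be counted as "not 1".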

In all cases, specifying byte is good practice because it saves storage (without it, generate creates a float by default), but it makes no difference to the results.
