
Generate dummies in Stata vs. in R

Stata

r, u, and s are dummies. Does the following line also generate a dummy n that equals 1 if r, u, or s equals 1, just with the == 1 omitted after r, u, and s?

generate byte n = r | u | s


R

Does specifying byte make a difference when generating the variable in R, or does it work the same way in R?

>Solution :

This answer addresses Stata questions only.

In Stata, if r, u, and s are all 0/1 variables, then r | u | s is also 0/1: it is 1 if any of them is 1, and 0 if and only if all of them are 0. In that case it is equivalent to max(r, u, s).

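But watch out: if r, u, and s can be 0, 1, or missing, then r | u | s will also be 1 if any of them is missing, because Stata treats any nonzero value, including missing, as true. By contrast, max(r, u, s) ignores missing arguments and is missing only if all of them are missing.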
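To make the difference concrete, here is a minimal sketch on a toy dataset. The four rows and the variable names n_or and n_max are made up for illustration and are not part of the original question.

```stata
* Toy data: the third and fourth rows contain missing values
clear
input r u s
0 0 0
1 0 0
0 . 0
. . .
end

* Logical "or": missing counts as nonzero, hence as true
generate byte n_or = r | u | s

* max() ignores missings unless every argument is missing
generate byte n_max = max(r, u, s)

list, noobs
```

In the third row (0, ., 0), n_or is 1 while n_max is 0. In the fourth row (all missing), n_or is still 1 while n_max is missing.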

If missings are present, then you could use

  * 1 
  gen n = r | u | s if !missing(r, u, s) 

The result will be missing if any of r, u, s is missing. Otherwise, it will be 1 if any of them is 1 and 0 if all of them are 0.

  * 2 
  gen n = (r == 1) | (u == 1) | (s == 1) 

The result will be 1 if any argument is 1 and 0 otherwise. "Otherwise" is anything from all 0s to all missings.

  * 3 
  gen n = inlist(1, r, u, s) 

#3 is equivalent to #2.
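The three approaches can be compared side by side with a short sketch. The toy data and the names n1, n2, and n3 are assumptions chosen only for this illustration.

```stata
* Toy data mixing 0s, 1s, and missings
clear
input r u s
0 0 0
1 . 0
0 . 0
. . .
end

* 1: left missing whenever any argument is missing
generate byte n1 = r | u | s if !missing(r, u, s)

* 2: 1 only where some argument equals 1, 0 otherwise
generate byte n2 = (r == 1) | (u == 1) | (s == 1)

* 3: same test written with inlist()
generate byte n3 = inlist(1, r, u, s)

list, noobs

* 2 and 3 should agree in every observation
assert n2 == n3
```

On this data, n1 is 0 in the first row and missing in the other three, while n2 and n3 are both 1 in the second row and 0 elsewhere. Which behavior is preferable depends on whether an observation with a missing indicator should count as unknown or as "not 1".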

In all cases, specifying byte is good practice to save on storage, but not material otherwise.
