lapply and function arguments with different scope

I encountered strange behavior when using lapply to bootstrap a GLM. Each iteration of the lapply uses a different weight, but the formula variable is the same. Thus, the latter was kept outside the anonymous function.

Below is a reproducible toy example.

The following code runs as expected:

library(dplyr)
data_adult <- read.csv("https://raw.githubusercontent.com/guru99-edu/R-Programming/master/adult.csv")
data_adult$Y  <- (data_adult$hours.per.week > 40)

est_boot <- lapply(1:10, function(bb){
                       ff  <- as.formula('Y ~ gender')
                       w  <-  rexp( nrow(data_adult), 1)
                       glmout  <- glm( ff,  'quasibinomial', data_adult, w )
                       return(coef(glmout))
})

Whereas the following does not:

ff  <- as.formula('Y ~ gender')
est_boot <- lapply(1:10, function(bb){
                       w  <-  rexp( nrow(data_adult), 1)
                       glmout  <- glm( ff,  'quasibinomial', data_adult, w )
                       return(coef(glmout))
})
Error in eval(extras, data, env) : object 'w' not found

I thought maybe the function needs all of its arguments to be defined locally. However, data_adult is not defined locally either. Why is w not recognized when ff is defined outside the function?

I am using R 4.3.0.

>Solution :

In R, a formula has an attribute called .Environment, which you can inspect in your second version by calling:

attributes(ff)
#> $class
#> [1] "formula"
#> 
#> $.Environment
#> <environment: R_GlobalEnv>
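
As a minimal sketch of where that attribute comes from: as.formula() called on a string records the environment it was called from, and environment() reads it back. The helper name make_formula is hypothetical and used only for illustration.

```r
# Hypothetical helper: builds the formula inside a function body.
make_formula <- function() {
  as.formula("Y ~ x")
}

f_top   <- as.formula("Y ~ x")  # created in the current (calling) environment
f_local <- make_formula()       # created inside make_formula()'s frame

# environment() returns the object stored in the .Environment attribute.
same_as_attr <- identical(environment(f_top), attr(f_top, ".Environment"))

# The two formulas have the same text but carry different environments.
created_here   <- identical(environment(f_top), environment())
created_inside <- identical(environment(f_local), environment())
```

Here f_top carries the environment it was created in, while f_local carries the frame of make_formula(), which is why the place where a formula is built matters later on.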

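When glm builds the model frame, the variables in the formula, and extra arguments such as weights, are looked up first in data and then in the formula's .Environment. In your second version that environment is the global environment, so w cannot be found: it exists only inside the anonymous function. In your first version the formula is created inside the function, so its environment is the function's own frame, where w lives. data_adult is not affected because the data argument is evaluated in the frame that calls glm, not through the formula. You can get round this by assigning the local environment to the .Environment attribute inside lapply: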

ff  <- as.formula('Y ~ gender')

est_boot <- lapply(1:10, function(bb){
  w  <-  rexp( nrow(data_adult), 1)
  environment(ff) <- environment()
  glmout  <- glm(ff,  'quasibinomial', data_adult, w )
  return(coef(glmout))
})
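
Another way around it, sketched here on the assumption that carrying the weights as a column is acceptable: since data is searched before the formula's environment, weights stored in the data frame are found no matter where the formula was created. To run without a download, the example uses the built-in mtcars data with a made-up response Y in place of data_adult. The helper names fit_broken and fit_column are hypothetical.

```r
# Toy stand-in for data_adult (assumption: built-in mtcars, made-up response).
dat <- mtcars
dat$Y <- dat$am == 1

ff <- as.formula("Y ~ hp")  # formula created outside the fitting functions

# Reproduces the problem: w lives only in this function's frame, which is
# neither `d` nor environment(ff), so the weights lookup fails.
fit_broken <- function(d) {
  w <- rexp(nrow(d), 1)
  glm(ff, family = "quasibinomial", data = d, weights = w)
}
res <- tryCatch(fit_broken(dat), error = function(e) conditionMessage(e))

# Workaround: store the weights as a column, so `weights = w` resolves in `d`.
fit_column <- function(d) {
  d$w <- rexp(nrow(d), 1)
  glm(ff, family = "quasibinomial", data = d, weights = w)
}

set.seed(1)
est_boot <- lapply(1:3, function(bb) coef(fit_column(dat)))
```

Here res holds the error message from the failing call, and est_boot holds one coefficient vector per replicate. This variant leaves the formula's environment untouched, at the cost of copying the data frame in each iteration.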
