How do I change predictors in linear regression in loop in R?

Below is an example along with the error it produces. Can someone please fix it?

# sample data (the mpg dataset ships with ggplot2)
library(ggplot2)
mpg <- mpg

str(mpg)

# array of predictors
predictors <- c("hwy", "cty")

# loop over predictors
for (predictor in predictors) 
{
  # fit linear regression
  model <- lm(formula = predictor ~ displ + cyl,
              data = mpg)
  
  # summary of model
  summary(model)
}

Error

Error in model.frame.default(formula = predictor ~ displ + cyl, data = mpg,  : 
  variable lengths differ (found for 'displ')

Solution:

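The error occurs because predictor inside the formula is taken literally as a variable name, not replaced by its value. lm() therefore uses the length-1 character string stored in predictor as the response, while displ has 234 values, hence "variable lengths differ". Build the formula from the string instead, using either reformulate or paste. Also, because results are not printed automatically inside a for loop, create an object to store the output from summary. (Note that hwy and cty act as response variables here, despite the name of the predictors vector.)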

sumry_model <- vector('list', length(predictors))
names(sumry_model) <- predictors
for (predictor in predictors) {
  # fit linear regression
  model <- lm(reformulate(c("displ", "cyl"), response = predictor),
              data = mpg)
  # alternative: build the formula with paste
  # model <- lm(as.formula(paste(predictor, "~ displ + cyl")), data = mpg)

  # store the summary of the model under the response's name
  sumry_model[[predictor]] <- summary(model)
}
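
To see what the loop builds on each iteration, the formula can be constructed and inspected on its own. This is a minimal sketch, not part of the original answer; the object names f and g are made up for illustration, and "hwy" stands in for one value of predictor.

```r
# Build the formula for a single response variable with reformulate()
f <- reformulate(c("displ", "cyl"), response = "hwy")
print(class(f))    # a formula object, not a character string
print(deparse(f))  # text form of the formula

# The paste-based route: assemble the text, then convert it with as.formula()
g <- as.formula(paste("hwy", "~ displ + cyl"))

# Both routes should describe the same model
print(identical(deparse(f), deparse(g)))
```

Either object can be passed to lm() as its first argument, which is what the loop above does once per response.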

Output:

> sumry_model
$hwy

Call:
lm(formula = reformulate(c("displ", "cyl"), response = predictor), 
    data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.5098 -2.1953 -0.2049  1.9023 14.9223 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  38.2162     1.0481  36.461  < 2e-16 ***
displ        -1.9599     0.5194  -3.773 0.000205 ***
cyl          -1.3537     0.4164  -3.251 0.001323 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.759 on 231 degrees of freedom
Multiple R-squared:  0.6049,    Adjusted R-squared:  0.6014 
F-statistic: 176.8 on 2 and 231 DF,  p-value: < 2.2e-16


$cty

Call:
lm(formula = reformulate(c("displ", "cyl"), response = predictor), 
    data = mpg)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.9276 -1.4750 -0.0891  1.0686 13.9261 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  28.2885     0.6876  41.139  < 2e-16 ***
displ        -1.1979     0.3408  -3.515 0.000529 ***
cyl          -1.2347     0.2732  -4.519 9.91e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.466 on 231 degrees of freedom
Multiple R-squared:  0.6671,    Adjusted R-squared:  0.6642 
F-statistic: 231.4 on 2 and 231 DF,  p-value: < 2.2e-16
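
The full summaries are long. If only the coefficients are needed, the same fits can be collected with lapply and reduced to a small table. This is a sketch under assumptions rather than part of the original answer: the names responses, models and coef_table are hypothetical, and it assumes ggplot2 is installed so that mpg is available.

```r
library(ggplot2)  # assumption: ggplot2 is installed; it provides mpg

# Same values as the `predictors` vector; renamed here because they are responses
responses <- c("hwy", "cty")

# Fit one model per response and keep the fitted objects in a named list
models <- lapply(setNames(responses, responses), function(resp) {
  lm(reformulate(c("displ", "cyl"), response = resp), data = mpg)
})

# One row per response, one column per coefficient
coef_table <- t(sapply(models, coef))
print(coef_table)
```

Keeping the fitted models rather than only their summaries leaves functions such as coef, predict or anova available later.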

This can also be done in a single call by fitting a multivariate response:

summary(lm(cbind(hwy, cty) ~ displ + cyl, data = mpg))

Or, to reuse the predictors vector instead of typing the column names:

summary(lm(as.matrix(mpg[predictors]) ~ displ + cyl, data = mpg))