Prediction model for loop


I want to create residuals vs fitted plots for each of my numeric variables.

An example:

plot(glm(Outcome~Age, family="binomial", data=pima_train))

This produces the diagnostic plots (image not shown).

My current code:

mod_summaries <- list() # creating empty list

for(i in 2:ncol(select_if(is.numeric(pima_train)))) {    # Head of for-loop
 
  predictors_i <- colnames(pima_train)[2:i]    # Create vector of predictor names
  mod_summaries[[i - 1]] <- summary(     # Store regression model summary in list
    lm(y ~ ., data[ , c("y", predictors_i)]))
}

Is it possible to do this without creating a separate dataset that contains only the numeric variables? Saving RAM would be nice, and I would like to see what that looks like.

Solution:

You could identify the numeric covariates when setting up the loop, which avoids creating a separate numeric-only dataset:

Data

set.seed(123)
pima_train <- data.frame(Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
                 pred1 = runif(100),
                 pred2 = runif(100),
                 pred3 = sample(LETTERS, 100, replace = TRUE),
                 pred4 = runif(100))

Use is.numeric with lapply to flag the numeric columns, then loop over the names of those columns:

for(i in names(pima_train)[unlist(lapply(pima_train, is.numeric))]){
  plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train))
}

## or, alternatively, use an `if` statement in the loop
for(i in names(pima_train)) {
  if (is.numeric(pima_train[[i]])) {   # [[i]] always returns the column itself
    plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train))
  }
}
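
Since the question asks specifically for residuals vs fitted plots, here is a minimal sketch of a variant of the loop above. It is not part of the original answer: it assumes the toy pima_train data from this answer, builds each formula with reformulate() instead of get(), and passes which = 1 to plot() so that only the residuals vs fitted panel is drawn. The names is_num, num_vars, and fit are illustrative.

```r
# Toy data, same as in the answer above
set.seed(123)
pima_train <- data.frame(Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
                         pred1 = runif(100),
                         pred2 = runif(100),
                         pred3 = sample(LETTERS, 100, replace = TRUE),
                         pred4 = runif(100))

# Logical vector flagging numeric columns; no numeric-only copy of the data is made
is_num <- vapply(pima_train, is.numeric, logical(1))
num_vars <- setdiff(names(pima_train)[is_num], "Outcome")

for (v in num_vars) {
  # reformulate() builds the formula Outcome ~ <v> from the column name
  fit <- glm(reformulate(v, response = "Outcome"),
             family = binomial, data = pima_train)
  # which = 1 requests only the residuals vs fitted panel
  plot(fit, which = 1, main = v)
}
```

Building the formula from the column name also means the plot and the model output show the real variable name rather than get(i).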

You could also use lapply to avoid a for loop altogether:

lapply(pima_train[-1], function(x) if (is.numeric(x)) {
  plot(glm(Outcome ~ x, family = "binomial", data = pima_train))
})
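
If you also want to keep the fitted models around, in the spirit of the mod_summaries list from the question, something like the following might work. This is a sketch under assumptions rather than a drop-in replacement: it fits one single-predictor logistic model per numeric column (not the cumulative lm models in the question's loop), and the names num_vars and models are made up for illustration.

```r
# Toy data, same as in the answer above
set.seed(123)
pima_train <- data.frame(Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
                         pred1 = runif(100),
                         pred2 = runif(100),
                         pred3 = sample(LETTERS, 100, replace = TRUE),
                         pred4 = runif(100))

# Names of the numeric columns only
num_vars <- names(pima_train)[vapply(pima_train, is.numeric, logical(1))]

# One logistic model per numeric predictor, stored in a list named by predictor
models <- lapply(setNames(num_vars, num_vars), function(v) {
  glm(reformulate(v, response = "Outcome"), family = binomial, data = pima_train)
})

# Model summaries, analogous to mod_summaries in the question
mod_summaries <- lapply(models, summary)

# Any stored model can be plotted later, e.g. residuals vs fitted for pred1
plot(models[["pred1"]], which = 1)
```

Note that each fitted glm object keeps its own model frame by default, so storing many models trades some memory for convenience.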
