I want to create residuals vs fitted plots for each of my numeric variables.
An example:
plot(glm(Outcome~Age, family="binomial", data=pima_train))
Which plots:
My current code:
mod_summaries <- list() # creating empty list
for(i in 2:ncol(select_if(is.numeric(pima_train)))) { # Head of for-loop
predictors_i <- colnames(pima_train)[2:i] # Create vector of predictor names
mod_summaries[[i - 1]] <- summary( # Store regression model summary in list
lm(y ~ ., data[ , c("y", predictors_i)]))
}
Is it possible to write this without creating a separate dataset that contains only the numeric variables? Saving RAM would be good, and I would like to see what that looks like.
>Solution :
You could identify the numeric covariates when setting up the loop, which avoids creating a separate numeric-only dataset such as numeric_train:
Data
set.seed(123)
pima_train <- data.frame(Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
pred1 = runif(100),
pred2 = runif(100),
pred3 = sample(LETTERS, 100, replace = TRUE),
pred4 = runif(100))
Use is.numeric with lapply to find the numeric columns, then loop over their names:
for(i in names(pima_train)[unlist(lapply(pima_train, is.numeric))]){
  plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train))
}
## or, alternatively, use an `if` statement in the loop
for(i in names(pima_train)) {
  if (is.numeric(pima_train[, i])) {
    plot(glm(Outcome ~ get(i), family = "binomial", data = pima_train))
  }
}
You could also use lapply to avoid a for loop altogether:
lapply(pima_train[-1], function(x) if(is.numeric(x)){
  plot(glm(Outcome ~ x, family = "binomial", data = pima_train))})
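One drawback of get(i) (or a bare x inside lapply) is that the fitted models and plot captions are labelled get(i) or x rather than the real column name. A possible variation, sketched here as a suggestion rather than a definitive fix, builds each formula from the column name with reformulate(). The names numeric_vars and models are made up for this illustration, and the setdiff() step assumes the response is called Outcome and may itself be stored as a number in the real data.

```r
# Same example data as above
set.seed(123)
pima_train <- data.frame(
  Outcome = as.factor(sample(0:1, 100, replace = TRUE)),
  pred1 = runif(100),
  pred2 = runif(100),
  pred3 = sample(LETTERS, 100, replace = TRUE),
  pred4 = runif(100)
)

# Names of the numeric columns only; the data itself is not copied
numeric_vars <- names(pima_train)[vapply(pima_train, is.numeric, logical(1))]
# Drop the response in case it is stored as a number (assumption: it is named "Outcome")
numeric_vars <- setdiff(numeric_vars, "Outcome")

# Fit one single-predictor logistic regression per numeric column
models <- lapply(numeric_vars, function(v) {
  glm(reformulate(v, response = "Outcome"), family = binomial, data = pima_train)
})
names(models) <- numeric_vars

# which = 1 draws only the residuals vs fitted panel;
# sub.caption labels each plot with the variable that was used
for (v in numeric_vars) {
  plot(models[[v]], which = 1, sub.caption = paste("Outcome ~", v))
}
```

Keeping the fitted models in a named list also lets you inspect them afterwards, for example with summary(models$pred1), without refitting.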
