I’m working with the Boston Housing data set in the MASS package. The desired goal is something like this:
library(MASS)
library(tidyverse)
library(gam)
Boston.splines <- gam(medv ~ s(crim) + s(zn) + s(indus), data = Boston)
I can get everything except the spline function to work automatically:
names_Boston <- names(Boston[,1:4])
f1 <- paste("medv ~", paste(names_Boston, collapse = "+"))
f1 <- as.formula(f1)
Boston1.gam <- gam(f1, data = Boston)
But for the life of me I can’t seem to get the s() function to be added to the front of each of the column names.
I’ve tried dplyr and base R, nothing works. For example, this:
set_names(paste0('s(', paste0(names_Boston), paste0')))
returned an error message:
Error: unexpected string constant in "set_names(paste0('s(', paste0(names_Boston), paste0')'"
How can I automatically wrap each column name in the smoothing spline function s(), so that the result is a formula such as gam(medv ~ s(crim) + s(zn) + s(indus), data = Boston)?
>Solution :
This works for me:
library(MASS)
library(gam)
#> Loading required package: splines
#> Loading required package: foreach
#> Loaded gam 1.22-2
names_Boston <- names(Boston[,1:3])
f1 <- as.formula(paste0('medv ~', paste0('s(', names_Boston, ')', collapse = '+')))
gam(f1, data = Boston)
#> Call:
#> gam(formula = f1, data = Boston)
#>
#> Degrees of Freedom: 505 total; 493.0002 Residual
#> Residual Deviance: 26249.62
Created on 2023-06-01 with reprex v2.0.2
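To make the one-liner easier to follow, here is the same construction split into steps. This is only a sketch: the intermediate names `smooth_terms` and `rhs` are introduced here for illustration and do not appear in the code above. The key point is that `paste0()` is vectorised, so it wraps every column name in `s(...)` in one call, and `collapse` then joins the pieces into a single string.

```r
library(MASS)  # provides the Boston data set

# First three predictor names: crim, zn, indus
names_Boston <- names(Boston)[1:3]

# paste0() is vectorised: each name is wrapped in s(...) separately,
# giving the character vector c("s(crim)", "s(zn)", "s(indus)")
smooth_terms <- paste0("s(", names_Boston, ")")

# Join the terms into one right-hand side string
rhs <- paste(smooth_terms, collapse = " + ")

# Build the formula from the full string
f1 <- as.formula(paste("medv ~", rhs))
f1
```

The attempt in the question failed before any of this ran: `paste0')))` is missing its opening parenthesis, which is what triggers the "unexpected string constant" parse error.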
But column 4 (chas) cannot be smoothed: it is a 0/1 indicator with only two unique values, so s(chas) fails with:
A smoothing variable encountered with 3 or less unique values; at least 4 needed
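One possible way around this, offered as a sketch rather than as part of the answer above, is to wrap a column in `s()` only when it has at least 4 unique values and to enter the remaining columns as plain linear terms. The names `predictors`, `n_unique`, `rhs_terms` and `f2` are hypothetical, and the threshold of 4 is taken from the error message.

```r
library(MASS)  # provides the Boston data set

# Candidate predictors: the first four columns (crim, zn, indus, chas)
predictors <- names(Boston)[1:4]

# Number of unique values in each candidate column
n_unique <- vapply(Boston[predictors],
                   function(x) length(unique(x)),
                   integer(1))

# Smooth only columns with at least 4 unique values; keep the rest linear
rhs_terms <- unname(ifelse(n_unique >= 4,
                           paste0("s(", predictors, ")"),
                           predictors))

# reformulate() builds the formula from a character vector of term labels
f2 <- reformulate(rhs_terms, response = "medv")
f2

# Fit the model only if the gam package is installed
if (requireNamespace("gam", quietly = TRUE)) {
  library(gam)
  fit <- gam(f2, data = Boston)
}
```

With the first four columns this yields `medv ~ s(crim) + s(zn) + s(indus) + chas`, so chas stays in the model without being passed to `s()`.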