Plot standard error in base r scatterplot

January 5, 2023

I am well aware of how to plot the standard error of a regression using ggplot. As an example with the iris dataset, this can be easily done with this code:

library(tidyverse)

iris %>% 
  ggplot(aes(x=Sepal.Width,
             y=Sepal.Length))+
  geom_point()+
  geom_smooth(method = "lm",
              se=T)

I also know that a regression using base R scatterplots can be achieved with this code:

#### Scatterplot ####
plot(iris$Sepal.Width,
     iris$Sepal.Length)

#### Fit Regression ####
fit <- lm(iris$Sepal.Length ~ iris$Sepal.Width)

#### Fit Line to Plot ####
abline(fit, col="red")

However, I’ve tried looking up how to plot standard error in base R scatterplots, but all I have found both on SO and Google is how to do this with error bars. However, I would like to shade the error in a similar way as ggplot does above. How can one accomplish this?

Edit

To manually obtain the standard error of this regression, I believe you would calculate it like so:

#### Derive Standard Error ####
fit <- lm(Sepal.Length ~ Sepal.Width,
          iris)
n <- length(iris)
df <- n-2 # degrees of freedom
y.hat <- fitted(fit)
res <- resid(fit)
sq.res <- res^2
ss.res <- sum(sq.res)
se <- sqrt(ss.res/df)

So if this may allow one to fit it into a base R plot, I’m all ears.

>Solution :

Here’s a slightly fiddly approach using broom::augment to generate a dataset with predictions and standard errors. You could also do it in base R with predict if you don’t want to use broom but that’s a couple of extra lines.

Note: I was puzzled as to why the interval in my graph are narrower than your ggplot interval in the question. But a look at the geom_smooth documentation suggests that the se=TRUE option adds a 95% confidence interval rather than +-1 se as you might expect. So its probably better to generate your own intervals rather than letting the graphics package do it!

#### Fit Regression (note use of `data` argument) ####
fit <- lm(data=iris, Sepal.Length ~ Sepal.Width)

#### Generate predictions and se ####
dat <- broom::augment(fit, se_fit = TRUE)

#### Alternative using `predict` instead of broom ####
dat <- cbind(iris, 
             .fitted=predict(fit, newdata = iris), 
             .se.fit=predict(fit, newdata = iris, se.fit = TRUE)$se.fit)

#### Now sort the dataset in the x-axis order
dat <- dat[order(dat$`Sepal.Width`),]

#### Plot with predictions and standard errors
with(dat, {
  plot(Sepal.Width,Sepal.Length)
  polygon(c(Sepal.Width, rev(Sepal.Width)), c(.fitted+.se.fit, rev(.fitted-.se.fit)), border = NA, col = hsv(1,1,1,0.2))
  lines(Sepal.Width,.fitted, lwd=2)
  lines(Sepal.Width,.fitted+.se.fit, col="red")
  lines(Sepal.Width,.fitted-.se.fit, col="red")
  })