Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Plot standard error in base r scatterplot

I am well aware of how to plot the standard error of a regression using ggplot. As an example with the iris dataset, this can be easily done with this code:

library(tidyverse)

iris %>% 
  ggplot(aes(x=Sepal.Width,
             y=Sepal.Length))+
  geom_point()+
  geom_smooth(method = "lm",
              se=T)

enter image description here

I also know that a regression using base R scatterplots can be achieved with this code:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

#### Scatterplot ####
plot(iris$Sepal.Width,
     iris$Sepal.Length)

#### Fit Regression ####
fit <- lm(iris$Sepal.Length ~ iris$Sepal.Width)

#### Fit Line to Plot ####
abline(fit, col="red")

enter image description here

However, I’ve tried looking up how to plot standard error in base R scatterplots, but all I have found both on SO and Google is how to do this with error bars. However, I would like to shade the error in a similar way as ggplot does above. How can one accomplish this?

Edit

To manually obtain the standard error of this regression, I believe you would calculate it like so:

#### Derive Standard Error ####
fit <- lm(Sepal.Length ~ Sepal.Width,
          iris)
n <- length(iris)
df <- n-2 # degrees of freedom
y.hat <- fitted(fit)
res <- resid(fit)
sq.res <- res^2
ss.res <- sum(sq.res)
se <- sqrt(ss.res/df)

So if this may allow one to fit it into a base R plot, I’m all ears.

>Solution :

Here’s a slightly fiddly approach using broom::augment to generate a dataset with predictions and standard errors. You could also do it in base R with predict if you don’t want to use broom but that’s a couple of extra lines.

Note: I was puzzled as to why the interval in my graph are narrower than your ggplot interval in the question. But a look at the geom_smooth documentation suggests that the se=TRUE option adds a 95% confidence interval rather than +-1 se as you might expect. So its probably better to generate your own intervals rather than letting the graphics package do it!

#### Fit Regression (note use of `data` argument) ####
fit <- lm(data=iris, Sepal.Length ~ Sepal.Width)

#### Generate predictions and se ####
dat <- broom::augment(fit, se_fit = TRUE)

#### Alternative using `predict` instead of broom ####
dat <- cbind(iris, 
             .fitted=predict(fit, newdata = iris), 
             .se.fit=predict(fit, newdata = iris, se.fit = TRUE)$se.fit)

#### Now sort the dataset in the x-axis order
dat <- dat[order(dat$`Sepal.Width`),]

#### Plot with predictions and standard errors
with(dat, {
  plot(Sepal.Width,Sepal.Length)
  polygon(c(Sepal.Width, rev(Sepal.Width)), c(.fitted+.se.fit, rev(.fitted-.se.fit)), border = NA, col = hsv(1,1,1,0.2))
  lines(Sepal.Width,.fitted, lwd=2)
  lines(Sepal.Width,.fitted+.se.fit, col="red")
  lines(Sepal.Width,.fitted-.se.fit, col="red")
  })

enter image description here

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading