Why are stats::loess and geom_smooth(method = "loess") different?

The line drawn by geom_smooth() (RED) appears smoother in ggplot2 than the fitted values of stats::loess plotted with geom_line() (BLUE).

Why? And how do you make the geom_line() like the line produced by geom_smooth()?

Reprex:

library(dplyr)
library(ggplot2)

# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498, 
    0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249, 
    0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139, 
    0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295, 
    0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889, 
    0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1, 
    39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7, 
    23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df", 
    "tbl", "data.frame"))

# Add loess fitted values manually (one fitted value per observed row)
data <- data %>%
  mutate(pred_loess = stats::loess(value ~ date_int, method = "loess")$fitted)

# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  geom_line(aes(y = pred_loess), colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")

Solution:

To manually plot the loess line, make a new dataframe with regularly spaced x-values and use the predict() function to find the values for the y-variable.

library(dplyr)
library(ggplot2)

# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498, 
    0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249, 
    0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139, 
    0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295, 
    0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889, 
    0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1, 
    39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7, 
    23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df", 
    "tbl", "data.frame"))

fit <- stats::loess(value ~ date_int, data = data)

# Make data.frame for loess trend
fit_df <- data.frame(
  date_int = seq(min(data$date_int), max(data$date_int), length.out = 500)
)
fit_df$value <- predict(fit, newdata = fit_df)

# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  geom_line(data = fit_df, colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2022-04-20 by the reprex package (v0.3.0)

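As mentioned in the comments, your previous approach only gave fitted values at the x-values present in your dataframe, so geom_line() joined those 21 points with straight segments. geom_smooth() instead evaluates the fit on an evenly spaced sequence along the x-axis (80 points by default), which is why its line looks smoother.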
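To make the difference concrete, here is a minimal base-R sketch. It uses a small made-up dataset (the `x` and `y` vectors and the names `d`, `fit`, and `grid` are illustrative and not taken from the question). It compares the number of fitted values stored in the loess object with the number of points obtained by predicting on an evenly spaced grid. The grid length of 80 mirrors the default `n` of geom_smooth().

```r
# Hypothetical, unevenly spaced data (not the data from the question)
x <- c(0.10, 0.25, 0.40, 0.55, 0.70, 0.85, 0.90, 0.95, 0.96, 0.97, 0.98, 0.99, 1.00)
y <- c(50, 46, 43, 41, 40, 42, 38, 41, 30, 25, 31, 35, 32)
d <- data.frame(x = x, y = y)

# Same call and defaults as in the answer above (span = 0.75, degree = 2)
fit <- stats::loess(y ~ x, data = d)

# fit$fitted has one value per observation: the curve is only known at the
# observed x-values, so a line through these points is piecewise linear
n_fitted <- length(fit$fitted)

# Predicting on an evenly spaced grid samples the same curve more densely;
# 80 points matches the default n of geom_smooth()
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 80))
grid$y <- predict(fit, newdata = grid)
n_grid <- nrow(grid)
```

Passing `grid` to geom_line(data = grid, aes(x = x, y = y)) should then give a line comparable to the one from geom_smooth(). Increasing `length.out` only samples the same fitted curve more finely and does not change the fit itself.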
