Is it possible to customize the outlier calculation formula in ggplot?

library(ggplot2)
library(dplyr)  # provides the %>% pipe

mydata3 <- data.frame(x = c(1:10, 20, 100, 120))

mydata3 %>%
  ggplot(aes(x = x)) +
  geom_boxplot()

What formula does ggplot use to calculate outliers? Is it Quartile1 - 1.5*IQR and Quartile3 + 1.5*IQR?
If so, can I manually change this rule?

Solution:

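Yes, your calculation is correct: points more than 1.5 * IQR below the first quartile or above the third quartile lie beyond the whiskers and are drawn as outliers. From the docs: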

The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles).

and

The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the inter-quartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge.
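
To make the rule concrete, here is a small base R sketch that applies the 1.5 * IQR rule to the question's data by hand. The variable names (lower_fence, upper_fence and so on) are only illustrative, and the sketch assumes the default quantile() method, which as far as I know is what ggplot2 uses for unweighted data:

```r
x <- c(1:10, 20, 100, 120)

# First and third quartiles (the hinges)
q <- quantile(x, c(0.25, 0.75), names = FALSE)  # 4 and 10
iqr <- q[2] - q[1]                              # 6

# Fences: anything outside them counts as an outlier
lower_fence <- q[1] - 1.5 * iqr                 # -5
upper_fence <- q[2] + 1.5 * iqr                 # 19

outliers <- x[x < lower_fence | x > upper_fence]
outliers                                        # 20 100 120

# Whiskers stop at the most extreme values still inside the fences
whiskers <- range(x[x >= lower_fence & x <= upper_fence])
whiskers                                        # 1 10
```

So for this data the default boxplot draws 20, 100 and 120 as individual points, and the upper whisker ends at 10.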

To change the positions of the whiskers and hinges, we can use stat = "summary" inside geom_boxplot, and pass our own summary function. This needs to be a function that takes the passed values and returns a data frame with columns ymin, lower, middle, upper and ymax.

In this example, we set the whisker tips at the 10th and 90th percentiles, the hinges at the 40th and 60th percentiles, and the middle line at the 50th percentile:

mydata3 <- data.frame(x=(c(1:10,20,100,120)))

mydata3 %>% 
  ggplot(aes(y = 0, x = x)) +
  geom_boxplot(stat = "summary",
               fun.data = ~ data.frame(ymin   = quantile(.x, 0.1),
                                       lower  = quantile(.x, 0.4),
                                       middle = quantile(.x, 0.5),
                                       upper  = quantile(.x, 0.6),
                                       ymax   = quantile(.x, 0.9)),
               orientation = "y") 
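
One thing to be aware of: stat = "summary" only computes the five box values, so points beyond the custom whiskers are not drawn as outliers. If you still want them, one possible approach is to filter them yourself and add them with geom_point. This is only a sketch; box_10_90 and outside are hypothetical names introduced here, and the 10th/90th percentile limits simply mirror the whiskers chosen above:

```r
library(ggplot2)

mydata3 <- data.frame(x = c(1:10, 20, 100, 120))

# Hypothetical helper: same custom quantiles as the example above
box_10_90 <- function(v) {
  data.frame(ymin   = quantile(v, 0.1, names = FALSE),
             lower  = quantile(v, 0.4, names = FALSE),
             middle = quantile(v, 0.5, names = FALSE),
             upper  = quantile(v, 0.6, names = FALSE),
             ymax   = quantile(v, 0.9, names = FALSE))
}

# Points outside the custom whiskers, treated here as "outliers"
lims <- quantile(mydata3$x, c(0.1, 0.9), names = FALSE)
outside <- subset(mydata3, x < lims[1] | x > lims[2])

p <- ggplot(mydata3, aes(y = 0, x = x)) +
  geom_boxplot(stat = "summary", fun.data = box_10_90, orientation = "y") +
  geom_point(data = outside)  # draw the filtered points on top
```

For this data the whiskers sit at 2.2 and 84, so the points 1, 2, 100 and 120 are plotted individually. Print p to see the result.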

Changing the quantiles passed to fun.data gives a different boxplot. Here the whisker tips are at the 5th and 95th percentiles and the hinges at the 10th and 90th:

mydata3 %>% 
  ggplot(aes(y = 0, x = x)) +
  geom_boxplot(stat = "summary",
               fun.data = ~ data.frame(ymin   = quantile(.x, 0.05),
                                       lower  = quantile(.x, 0.1),
                                       middle = quantile(.x, 0.5),
                                       upper  = quantile(.x, 0.9),
                                       ymax   = quantile(.x, 0.95)),
               orientation = "y") 
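
If you only want to change the 1.5 multiplier rather than the quantiles themselves, a simpler option may be the coef argument, which geom_boxplot passes on to stat_boxplot (default 1.5). A minimal sketch, where the value 3 is an arbitrary choice for illustration:

```r
library(ggplot2)

mydata3 <- data.frame(x = c(1:10, 20, 100, 120))

# coef is the whisker length as a multiple of the IQR (default 1.5)
p <- ggplot(mydata3, aes(x = x)) +
  geom_boxplot(coef = 3)

# Rough cross-check with base R, which applies a similar coef rule
# (its hinges can differ slightly from ggplot2's quantiles for some sample sizes)
check <- boxplot.stats(mydata3$x, coef = 3)
check$out  # values flagged as outliers
```

With coef = 3 the fences move to -14 and 28 for this data, so 20 falls inside the upper whisker and only 100 and 120 remain as outliers.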
