How to remove low frequency bins in histogram

Let’s say I have a data frame containing a column of numbers that I want to visualise in a histogram. I want to show only the bins that contain more than, say, 50 observations.

Step 1

library(ggplot2)
library(dplyr)  # provides the %>% pipe

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))
p <- 
  x %>% 
  ggplot(., aes(x)) +
  geom_histogram()

p

Step 2

pg <- ggplot_build(p)

pg$data[[1]]

As a check, when I print pg$data[[1]], I’d like to see only the rows where count >= 50.

Thank you

Solution:

library(ggplot2)

# after_stat() replaces the ..count.. notation, which was deprecated in ggplot2 3.4.0
ggplot(x, aes(x = x, y = after_stat(ifelse(count > 50, count, 0)))) +
  geom_histogram(bins = 30)

To also label every bin with its count, including the bins whose bars are hidden, add a text layer:

library(ggplot2)

ggplot(x, aes(x = x, y = after_stat(ifelse(count > 50, count, 0)))) +
  geom_histogram(bins = 30, fill = "green", color = "grey") +
  stat_bin(aes(label = after_stat(count)), bins = 30, geom = "text", vjust = -0.7)
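
The plots above set the height of the low-count bars to zero, so those rows are still present in the built plot data. If the goal is for the printed data to contain only rows where count >= 50, as in the question's check, one possible approach is to filter the data returned by ggplot_build() and redraw the remaining bins. This is a minimal sketch, not part of the original answer: the names `bins` and `kept` are illustrative, and the >= 50 threshold is taken from the question's Step 2.

```r
library(ggplot2)

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))

# Build the histogram once so that ggplot2 computes the bins
p <- ggplot(x, aes(x)) +
  geom_histogram(bins = 30)
bins <- ggplot_build(p)$data[[1]]

# Keep only the bins with at least 50 observations
kept <- bins[bins$count >= 50, ]
kept[, c("xmin", "xmax", "count")]

# Redraw the surviving bins as rectangles, using the computed bin edges
ggplot(kept, aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count)) +
  geom_rect()
```

Because the rectangles reuse the xmin and xmax values computed by the histogram, the kept bars should sit where they did in the original plot, and the x-axis shrinks to the range they cover.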
