I want to calculate the IQR and the number of observations for each of my numeric columns, plug them into the Freedman-Diaconis rule for histogram bin width, and then use that bin width in a ggplot.
I can do this, as follows, with iris:
library(dplyr)
library(ggplot2)
datai = iris %>%
  filter(Species == "virginica") %>%
  select(-Species)
for (i in colnames(datai)) {
  # Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
  bw = 2 * IQR(datai[, i], na.rm = TRUE) / length(datai[, i])^(1/3)
  plot = ggplot(datai, aes(x = .data[[i]])) +
    geom_histogram(binwidth = bw)
  print(plot)
}
But with my own dataset I get an error that arises from IQR:
#MWE
datah = structure(list(DBP = c(74.667, 78.6666666666667, 82, 73, 78.6666666666667,
68.6667), SBP = c(134, 114.666666666667, 126, 161, 126, 141.333
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
for (i in colnames(datah)) {
bw = (2* IQR(datah[,i], na.rm = T) )/ length(datah[,i])^(1/3)
ggp3 <- ggplot(datah, aes(x = .data[[i]] )) +
geom_histogram( binwidth = bw)
print(ggp3)
}
The error is:
Error in quantile(as.numeric(x), c(0.25, 0.75), na.rm = na.rm, names = FALSE, :
'list' object cannot be coerced to type 'double'
>Solution :
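The dataset is a tibble, whereas iris is a data.frame. Extracting one column with [, i] returns a vector for a data.frame, but for a tibble it still returns a tibble with a single column, which IQR cannot coerce to numeric. According to ?IQR, the input x should be a numeric vector. Note that length() on a one-column tibble also returns 1 (the number of columns) rather than the number of observations, so the denominator would be wrong as well. Use [[ instead, which returns the column as a vector in both cases.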
for (i in colnames(datah)) {
  bw <- 2 * IQR(datah[[i]], na.rm = TRUE) / length(datah[[i]])^(1/3)
  ggp3 <- ggplot(datah, aes(x = .data[[i]])) +
    geom_histogram(binwidth = bw)
  print(ggp3)
}
-output (last column)
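To see the difference between the two kinds of extraction without plotting anything, here is a minimal base R sketch. It assumes a plain data.frame built from the MWE values and uses drop = FALSE to mimic what a tibble returns for [, i]. The helper fd_binwidth is a hypothetical name introduced only for illustration; unlike the loop above, it drops NA values before counting observations.

```r
# Values taken from the MWE in the question
df <- data.frame(
  DBP = c(74.667, 78.6666666666667, 82, 73, 78.6666666666667, 68.6667),
  SBP = c(134, 114.666666666667, 126, 161, 126, 141.333)
)

# drop = FALSE mimics tibble behaviour for [, i]: a one-column table
one_col <- df[, "DBP", drop = FALSE]
# [[ always returns the underlying vector
vec <- df[["DBP"]]

is.data.frame(one_col)  # TRUE: still a table, so IQR() cannot coerce it
is.numeric(vec)         # TRUE: a plain numeric vector
length(one_col)         # 1: counts columns, not observations
length(vec)             # 6: counts observations

# Hypothetical helper (not from any package): Freedman-Diaconis bin width
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]               # drop missing values before counting
  2 * IQR(x) / length(x)^(1/3)
}

# One bin width per column, returned as a named numeric vector
bws <- vapply(df, fd_binwidth, numeric(1))
```

Each element of bws could then be passed to geom_histogram(binwidth = ...) for the matching column, which avoids recomputing the bin width inside the plotting loop.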
