pROC – How to Get Confidence Intervals or Generate a Confusion Matrix

I used the pROC package to do an ROC analysis. It gave me the sensitivities, specificities, etc.

The journal is requesting 95% confidence intervals for every statistic provided. I see I can do that in the epiR package, but I have to give it a confusion matrix.

How do I use the threshold provided by pROC to get a confusion matrix?

Sample data and code:

library(pROC)
library(tibble)

data <- tribble(
  ~death, ~score,
  0, 0.132,
  1, 0.19,
  0, 0.03,
  1, 0.131,
  0, 0.02
)

roc <- roc(data$death, data$score, smoothed = TRUE,
           ci = TRUE, ci.alpha = 0.95, stratified = FALSE,
           plot = TRUE, auc.polygon = TRUE, max.auc.polygon = TRUE, grid = TRUE,
           print.auc = TRUE, show.thres = TRUE)

coords(roc, x = "best", ret = c("threshold", "specificity", "sensitivity", "accuracy",
                                "precision", "recall", "tpr", "ppv", "fpr"))

Solution:

To generate a confusion matrix, you first need to assign predicted outcomes (predicted dead, predicted alive) according to a threshold. The ROC curve, and therefore the AUC, is calculated over every possible threshold in your data, so you have to pick one of them. In this example I have arbitrarily selected the second lowest threshold:

# First pick a threshold (here, arbitrarily, the second one stored in the roc object)
thres <- roc$thresholds[2]

# Label each observation according to the threshold
data$predicted_death <- data$score > thres

# Convert to character vectors to make the table easier to read
data$predicted_death <- ifelse(data$predicted_death, "predicted_dead", "predicted_alive")
data$death <- ifelse(data$death == 1, "dead", "alive")

# Count the true positives, false positives, false negatives and true negatives
# in a confusion matrix using the R function table()
cm <- table(data$death, data$predicted_death)
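
Once you have the four counts, the confidence intervals the journal asks for can be computed directly from them. Below is a minimal sketch in base R, assuming exact (Clopper-Pearson) binomial intervals from binom.test() are acceptable. The helper prop_ci() is a hypothetical name of mine, not part of pROC or epiR, and the threshold 0.025 (the midpoint between the two lowest scores) is only illustrative.

```r
# Hypothetical helper (not from pROC or epiR): point estimate plus an exact
# binomial confidence interval for a proportion, via base R's binom.test()
prop_ci <- function(successes, total, conf_level = 0.95) {
  test <- binom.test(successes, total, conf.level = conf_level)
  c(estimate = successes / total,
    lower    = test$conf.int[1],
    upper    = test$conf.int[2])
}

# Sample data from the question
death <- c(0, 1, 0, 1, 0)
score <- c(0.132, 0.19, 0.03, 0.131, 0.02)
thres <- 0.025  # illustrative threshold: midpoint between the two lowest scores

# Fix the factor levels so every cell exists even when its count is zero,
# with the positive class first (test result in rows, true status in columns)
actual    <- factor(ifelse(death == 1, "dead", "alive"),
                    levels = c("dead", "alive"))
predicted <- factor(ifelse(score > thres, "predicted_dead", "predicted_alive"),
                    levels = c("predicted_dead", "predicted_alive"))
cm <- table(predicted, actual)

tp <- cm["predicted_dead",  "dead"]
fn <- cm["predicted_alive", "dead"]
fp <- cm["predicted_dead",  "alive"]
tn <- cm["predicted_alive", "alive"]

sensitivity <- prop_ci(tp, tp + fn)  # TP / (TP + FN)
specificity <- prop_ci(tn, tn + fp)  # TN / (TN + FP)
ppv         <- prop_ci(tp, tp + fp)  # TP / (TP + FP)

print(cm)
print(rbind(sensitivity, specificity, ppv))
```

As far as I know, this layout (test result in rows, true status in columns, positives first) is also what epiR::epi.tests() expects for a 2x2 table, but check its documentation before passing cm to it. With only five observations the intervals will be very wide.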

I would advise choosing a threshold that balances sensitivity and specificity, for example the one that maximises the Youden index (sensitivity + specificity - 1).
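
As a sketch of that idea, the snippet below asks pROC for the Youden-optimal threshold with coords() and builds the confusion matrix from it. The names roc_obj, best_thres and cm_best are mine. I set levels and direction explicitly so pROC does not have to guess them, and I unlist the result because, as far as I remember, the return type of coords() differs between pROC versions.

```r
library(pROC)

# Sample data from the question
death <- c(0, 1, 0, 1, 0)
score <- c(0.132, 0.19, 0.03, 0.131, 0.02)

# Controls are coded 0, cases 1, and higher scores indicate cases
roc_obj <- roc(death, score, levels = c(0, 1), direction = "<")

# Threshold that maximises the Youden index (sensitivity + specificity - 1).
# unlist() copes with either a data frame or a plain numeric return value;
# [1] keeps the first threshold in case of ties.
best <- coords(roc_obj, x = "best", best.method = "youden", ret = "threshold")
best_thres <- as.numeric(unlist(best))[1]

# pROC's thresholds lie midway between observed scores, so ">" and ">="
# classify the observations identically here
actual    <- factor(ifelse(death == 1, "dead", "alive"),
                    levels = c("dead", "alive"))
predicted <- factor(ifelse(score > best_thres, "predicted_dead", "predicted_alive"),
                    levels = c("predicted_dead", "predicted_alive"))
cm_best <- table(predicted, actual)

print(best_thres)
print(cm_best)
```

pROC also provides ci.coords(), which bootstraps confidence intervals for coordinates such as sensitivity and specificity at a given threshold. That may save you the detour through a confusion matrix, though with a sample this small the bootstrap will not be very informative.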
