ROC Curves look like step functions for imbalanced data?

I’m making ROC curves for an imbalanced dataset, but they do not look like normal ROC curves at all. They look more like step functions (see image provided). From all the sources I can find, this should indicate that my machine learning algorithm is effective, but I’m still getting a lot of false positives. Why is my ROC curve so nicely shaped if my algorithm is not effective? Is there a better way to measure how ‘good’ this algorithm is? Thanks!

Step-Function ROC Curve

>Solution :

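ROC curves are not always the best way to evaluate algorithms on imbalanced datasets. A ‘good’ ROC curve just means a high true positive rate at a low false positive rate, and that can be misleading when the classes are imbalanced. The false positive rate is FP / (FP + TN), so when the majority (negative) class is very large, even a small false positive rate can correspond to a large number of false positives compared with the number of true positives. The curve can look excellent while most of your positive predictions are still wrong, which matches the false positives you are seeing.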
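To make that concrete, here is a small arithmetic sketch. The confusion-matrix counts are made up for illustration (they are not taken from the question's data); they only show how a point that looks good on a ROC curve can still come with poor precision.

```python
# Hypothetical counts for an imbalanced test set:
# 100 actual positives and 10,000 actual negatives.
tp, fn = 90, 10      # positives: correctly found / missed
fp, tn = 500, 9500   # negatives: wrongly flagged / correctly rejected

tpr = tp / (tp + fn)        # true positive rate (recall)
fpr = fp / (fp + tn)        # false positive rate
precision = tp / (tp + fp)  # share of positive predictions that are right

print(f"TPR={tpr:.3f} FPR={fpr:.3f} precision={precision:.3f}")
# prints: TPR=0.900 FPR=0.050 precision=0.153
```

On a ROC plot the point (FPR 0.05, TPR 0.90) sits close to the top-left corner, yet only about 15% of the flagged samples are actually positive.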

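Instead, try a PR (precision-recall) curve. PR curves are usually more informative for imbalanced datasets because precision, TP / (TP + FP), is computed only from the samples predicted positive and ignores true negatives, so a large majority class cannot hide the false positives. See this link: PR Curves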
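As a rough sketch of how a PR curve is built, the pure-Python code below sweeps the decision threshold from the highest score down and records precision and recall at each step. The helper names `pr_curve` and `average_precision`, and the labels and scores, are hypothetical and chosen for illustration only. In practice you would more likely use scikit-learn's `precision_recall_curve` and `average_precision_score` from `sklearn.metrics`.

```python
def pr_curve(y_true, scores):
    """Return (precision, recall) pairs, one per distinct threshold,
    sweeping the threshold from the highest score to the lowest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(y_true)
    tp = fp = 0
    points = []
    for k, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples tied at this score are counted.
        if k + 1 == len(order) or scores[order[k + 1]] != scores[i]:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points):
    """Step-wise area under the PR curve:
    sum of precision times the increase in recall at each threshold."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall in points:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


# Made-up example: 3 positives among 10 samples.
y_true = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

points = pr_curve(y_true, scores)
ap = average_precision(points)
print(f"average precision = {ap:.3f}")
```

A useful reference line on a PR plot is the positive-class prevalence (0.3 in this toy data), since that is the precision a classifier guessing at random would be expected to achieve.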
