Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

scikit-learn: fitting KNeighborsClassifier without labels

I’m trying to fit a simple KNN classifier, and wanted to use the scikit-learn implementation in order to benefit from their efficient implementation (multiprocessing, tree-based algorithms).

However, what I want to get as a result is just the list of distances and nearest neighbours for each data point, rather than the predicted label.
I will then compute the label separately in a non-standard way.

The kneighbors method seems exactly what I need, however I cannot call it without fitting the model with fit first. The issue is, fit() requires the labels (y) as a parameter.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Is there a way to achieve what I’m after? Perhaps I can pass fake labels in the fit() method – is there any issues I’m missing by doing this? E.g. is this going to affect the results (of the computed distances and list of nearest neighbours for each datapoint) in any way? I wouldn’t expect so but I’m not familiar with the workings of the scikit-learn implementation.

>Solution :

There is another algorithm for what you desire: NearestNeighbors.

This algorithm is unsupervised (you don’t need the y labels); moreover, there is one method (kneighbors) that calculates distances to points and which sample is.

Check the link, it is quite clear.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading