bench/allknn

This folder contains scripts that run all-nearest-neighbor searches in a number of libraries. For the most part, the scripts are very bare-bones: they don't even output the search results.
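For readers unfamiliar with the term, here is a minimal brute-force sketch of an all-nearest-neighbor search: for every point in a dataset, find the indices of its k closest other points. It is only an illustration and is not taken from the scripts in this folder; the function name `all_nearest_neighbors` and the toy dataset are made up for this example. The libraries compared here avoid this quadratic cost by using tree structures such as kd-trees, ball trees, and cover trees.

```python
import math


def all_nearest_neighbors(points, k=1):
    """For each point, return the indices of its k nearest other points.

    Brute force: compares every pair of points, so the cost grows
    quadratically with the number of points.
    """
    result = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        result.append([j for _, j in dists[:k]])
    return result


# Hypothetical toy dataset: two well-separated pairs of 2D points.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
print(all_nearest_neighbors(points, k=1))  # [[1], [0], [3], [2]]
```

Each inner list holds the neighbor indices for the point at the same position in `points`, so here every point's nearest neighbor is the other member of its pair.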

To run the scripts, you first need to install the libraries. The /install folder in this repo contains scripts for installing all of them. Once the libraries are installed, call the runtest.sh script with a single parameter: the dataset to test on.

The table below briefly describes the libraries being compared.

| Library | Description |
| --- | --- |
| FLANN | The Fast Library for Approximate Nearest Neighbors. This C++ library is the standard method for nearest neighbor queries in Matlab/Octave and the OpenCV computer vision toolkit. |
| Julia | A popular new language designed from the ground up for fast data processing. Julia supports fast nearest neighbor queries using the KDTrees.jl package. |
| Langford's cover tree | A reference implementation of the cover tree data structure, written by John Langford. The implementation is in C, and the data structure is widely included in C/C++ machine learning libraries. |
| MLPack | A C++ library for machine learning. MLPack was the first library to demonstrate the utility of generic programming in machine learning. The interface for nearest neighbor queries lets you use either a cover tree or a kd-tree. |
| R | A popular language for statisticians. Nearest neighbor queries are implemented in the FNN package, which provides bindings to the C-based ANN library for kd-trees. |
| scikit-learn | The Python machine learning toolkit. The documentation is very beginner friendly and easy to learn from. The interface for nearest neighbor queries lets you use either a ball tree or a kd-tree to speed up the calculations. Both data structures are written in Cython. |
| Weka | A Java data mining tool with a popular GUI frontend. In my tests, nearest neighbor queries in Weka are very slow and not remotely competitive with any of the libraries above. |