mlpack_allknn

NAME

mlpack_allknn - k-nearest-neighbors

SYNOPSIS

mlpack_allknn [-h] [-v]

DESCRIPTION

This program will calculate the k-nearest-neighbors of a set of points using kd-trees or cover trees (cover tree support is experimental and may be slow). You may specify a separate set of reference points and query points, or just a reference set which will be used as both the reference and query set.

For example, the following will calculate the 5 nearest neighbors of each point in ’input.csv’ and store the distances in ’distances.csv’ and the neighbors in ’neighbors.csv’:

$ mlpack_knn --k=5 --reference_file=input.csv --distances_file=distances.csv --neighbors_file=neighbors.csv

The output files are organized such that the entry at row i and column j of the neighbors output file is the index of the point in the reference set that is the i’th nearest neighbor of the point in the query set with index j. The entry at row i and column j of the distances output file is the distance between those two points.
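The layout described above can be illustrated with a small brute-force search. This is only a sketch, not mlpack code: the function name knn_matrices is hypothetical, Euclidean distance is assumed, and ranks are counted from 0 (row 0 holds the nearest neighbor), which the text above does not specify.

```python
import math

def knn_matrices(reference, query, k):
    """Brute-force k-NN returning (neighbors, distances) in the row/column layout above.

    neighbors[i][j] is the reference index of the i'th nearest neighbor
    (0-based rank, an assumption) of query point j; distances[i][j] is
    the matching distance.
    """
    neighbors = [[0] * len(query) for _ in range(k)]
    distances = [[0.0] * len(query) for _ in range(k)]
    for j, q in enumerate(query):
        # Rank every reference point by Euclidean distance to query point j.
        ranked = sorted((math.dist(q, r), idx) for idx, r in enumerate(reference))
        for i in range(k):
            distances[i][j], neighbors[i][j] = ranked[i]
    return neighbors, distances

reference = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
query = [(0.9, 0.0), (4.0, 0.0)]
neighbors, distances = knn_matrices(reference, query, k=2)
```

Each column of the result describes one query point; reading down a column lists its neighbors from nearest to farthest.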

OPTIONAL INPUT OPTIONS

--algorithm (-a) [string]

Type of neighbor search: ’naive’, ’single_tree’, ’dual_tree’, ’greedy’. Default value ’dual_tree’.

--epsilon (-e) [double]

If specified, will do approximate nearest neighbor search with given relative error. Default value 0.
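One common reading of "relative error" is that each reported distance is at most (1 + epsilon) times the true distance. The check below assumes that interpretation, which the option text does not spell out; the function name within_relative_error is hypothetical.

```python
def within_relative_error(approx, exact, epsilon):
    """True if every approximate distance is at most (1 + epsilon) times the exact one."""
    return all(a <= (1.0 + epsilon) * e for a, e in zip(approx, exact))

exact = [1.0, 2.0, 4.0]
approx = [1.05, 2.0, 4.3]

# 4.3 <= 1.1 * 4.0 holds, so the results satisfy epsilon = 0.1 ...
ok_loose = within_relative_error(approx, exact, 0.1)
# ... but 4.3 > 1.05 * 4.0, so they do not satisfy epsilon = 0.05.
ok_tight = within_relative_error(approx, exact, 0.05)
```

With epsilon set to 0 the bound collapses to exact search, which matches the default.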

--help (-h) [bool]

Default help info. Default value 0.

--info [string]

Get help on a specific module or option. Default value ’’.

--input_model_file (-m) [string]

Pre-trained kNN model. Default value ’’.

--k (-k) [int]

Number of nearest neighbors to find. Default value 0.

--leaf_size (-l) [int]

Leaf size for tree building (used for kd-trees, vp trees, random projection trees, UB trees, R trees, R* trees, X trees, Hilbert R trees, R+ trees, R++ trees, spill trees, and octrees). Default value 20.
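To show what the leaf size controls, here is a simplified kd-tree builder that stops splitting once a node holds at most leaf_size points. It is an illustrative sketch that splits at the median of the widest dimension; mlpack's own tree construction may use a different splitting rule, and the names build_kd_tree and leaf_sizes are hypothetical.

```python
def build_kd_tree(points, leaf_size):
    """Recursively split points until each leaf holds at most leaf_size points."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    if len(points) <= leaf_size:
        return {"leaf": True, "points": points}
    dims = len(points[0])
    # Split along the dimension with the widest spread of values.
    dim = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    return {
        "leaf": False,
        "dim": dim,
        "split": ordered[mid][dim],
        "left": build_kd_tree(ordered[:mid], leaf_size),
        "right": build_kd_tree(ordered[mid:], leaf_size),
    }

def leaf_sizes(node):
    """Collect the number of points stored in every leaf."""
    if node["leaf"]:
        return [len(node["points"])]
    return leaf_sizes(node["left"]) + leaf_sizes(node["right"])

points = [(float(i), float((i * 7) % 13)) for i in range(100)]
sizes_20 = leaf_sizes(build_kd_tree(points, leaf_size=20))
sizes_5 = leaf_sizes(build_kd_tree(points, leaf_size=5))
```

A smaller leaf size gives a deeper tree with more, smaller leaves; a larger one gives a shallower tree where more work is done by scanning points inside each leaf.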

--naive (-N) [bool]

(Deprecated) If true, O(n^2) naive mode is used for computation. Will be removed in mlpack 3.0.0. Use ’--algorithm naive’ instead. Default value 0.

--query_file (-q) [string]

Matrix containing query points (optional). Default value ’’.

--random_basis (-R) [bool]

Before tree-building, project the data onto a random orthogonal basis. Default value 0.

--reference_file (-r) [string]

Matrix containing the reference dataset. Default value ’’.
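As an illustration of the --random_basis option, the sketch below builds a random orthonormal basis by Gram-Schmidt orthogonalization of Gaussian vectors and projects points onto it. This is an assumed construction for demonstration, not necessarily how mlpack generates its basis; the function names are hypothetical. The point of the example is that such a projection leaves Euclidean distances unchanged, so the neighbors found are the same.

```python
import math
import random

def random_orthogonal_basis(dim, rng):
    """Build dim orthonormal vectors by Gram-Schmidt on random Gaussian vectors."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the vectors already in the basis.
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # skip the (unlikely) degenerate draw
            basis.append([x / norm for x in v])
    return basis

def project(point, basis):
    """Coordinates of point expressed in the new basis."""
    return [sum(x * y for x, y in zip(point, b)) for b in basis]

rng = random.Random(42)
basis = random_orthogonal_basis(3, rng)
a, b = (1.0, 2.0, 3.0), (4.0, 0.0, -1.0)
before = math.dist(a, b)
after = math.dist(project(a, basis), project(b, basis))
```

Here before and after agree up to floating-point rounding.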

--rho (-b) [double]

Balance threshold (only valid for spill trees). Default value 0.7.

--seed (-s) [int]

Random seed (if 0, std::time(NULL) is used). Default value 0.

--single_mode (-S) [bool]

(Deprecated) If true, single-tree search is used (as opposed to dual-tree search). Will be removed in mlpack 3.0.0. Use ’--algorithm single_tree’ instead. Default value 0.

--tau (-u) [double]

Overlapping size (only valid for spill trees). Default value 0.

--tree_type (-t) [string]

Type of tree to use: ’kd’, ’vp’, ’rp’, ’max-rp’, ’ub’, ’cover’, ’r’, ’r-star’, ’x’, ’ball’, ’hilbert-r’, ’r-plus’, ’r-plus-plus’, ’spill’, ’oct’. Default value ’kd’.

--true_distances_file (-D) [string]

Matrix of true distances to compute the effective error (average relative error) (it is printed when -v is specified). Default value ’’.

--true_neighbors_file (-T) [string]

Matrix of true neighbors to compute the recall (it is printed when -v is specified). Default value ’’.
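The two quality measures named for --true_distances_file and --true_neighbors_file can be sketched as follows. The exact formulas mlpack uses are not given here, so this assumes the usual definitions: average relative error as the mean of |found - true| / true over entries with a nonzero true distance, and recall as the fraction of found neighbors that appear among the true neighbors of the same query point. Matrices use the row i, column j layout of the output files; the function names are hypothetical.

```python
def average_relative_error(found, true):
    """Mean of |found - true| / true over all entries with a nonzero true distance."""
    terms = [
        abs(f - t) / t
        for f_row, t_row in zip(found, true)
        for f, t in zip(f_row, t_row)
        if t != 0.0
    ]
    return sum(terms) / len(terms) if terms else 0.0

def recall(found, true):
    """Fraction of found neighbors that are among the true neighbors, per query column."""
    k, n_queries = len(true), len(true[0])
    hits = 0
    for j in range(n_queries):
        true_set = {true[i][j] for i in range(k)}
        hits += sum(1 for i in range(k) if found[i][j] in true_set)
    return hits / (k * n_queries)

true_neighbors = [[1, 2], [0, 1]]
found_neighbors = [[1, 2], [2, 0]]
true_distances = [[1.0, 2.0], [2.0, 4.0]]
found_distances = [[1.0, 2.0], [2.5, 5.0]]

err = average_relative_error(found_distances, true_distances)
rec = recall(found_neighbors, true_neighbors)
```

In this toy data the nearest neighbor of each query point is found correctly and the second one is missed, so the recall is 0.5 and the average relative error is 0.125.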

--verbose (-v) [bool]

Display informational messages and the full list of parameters and timers at the end of execution. Default value 0.

--version (-V) [bool]

Display the version of mlpack. Default value 0.

OPTIONAL OUTPUT OPTIONS

--distances_file (-d) [string]

Matrix to output distances into. Default value ’’.

--neighbors_file (-n) [string]

Matrix to output neighbors into. Default value ’’.

--output_model_file (-M) [string]

If specified, the kNN model will be output here. Default value ’’.

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.