mlpack.emst

emst(...) Fast Euclidean Minimum Spanning Tree

>>> from mlpack import emst

This program can compute the Euclidean minimum spanning tree of a set of input points using the dual-tree Boruvka algorithm.

The set of points to compute the minimum spanning tree for is specified with the 'input' parameter, and the resulting tree is returned in the 'output' output parameter.

The 'leaf_size' parameter controls the leaf size of the kd-tree that is used to calculate the minimum spanning tree. If the 'naive' option is given, brute-force search is used instead; this is typically much slower in low dimensions. The leaf size does not affect the results, but it may affect the runtime of the algorithm.

For example, the minimum spanning tree of the input dataset 'data' can be calculated with a leaf size of 20 and stored as 'spanning_tree' using the following commands:

>>> output = emst(input=data, leaf_size=20)

>>> spanning_tree = output['output']

The output is a matrix with three columns, where each row represents an edge. The first column holds the lesser index of the edge, the second column holds the greater index of the edge, and the third column holds the distance between the two points.
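
As a minimal sketch of how such an edge list might be consumed, the snippet below walks rows of [lesser index, greater index, distance] and builds the total tree length and an adjacency map. The values in 'spanning_tree' are made up for illustration and stand in for real emst output, and 'summarize_edges' is a hypothetical helper, not part of mlpack. Because the output has a float dtype, the index columns are cast to int before they are used as point indices.

```python
def summarize_edges(edges):
    """Return (total_length, adjacency) for rows of [lesser, greater, distance]."""
    total = 0.0
    adjacency = {}
    for lesser, greater, dist in edges:
        # Indices arrive as floats, so convert them before indexing points.
        i, j = int(lesser), int(greater)
        total += float(dist)
        adjacency.setdefault(i, []).append(j)
        adjacency.setdefault(j, []).append(i)
    return total, adjacency


# Illustrative stand-in for output['output']; these are not real emst results.
spanning_tree = [
    [0.0, 1.0, 1.0],
    [1.0, 2.0, 2.5],
    [2.0, 3.0, 0.5],
]

total_length, adjacency = summarize_edges(spanning_tree)
print(total_length)
print(adjacency)
```

The same loop should work on a numpy matrix, since iterating over it yields one row per edge. A spanning tree over n points has n - 1 edges, so the number of rows is one less than the number of points.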

## input options

- input (numpy matrix or arraylike, float dtype): [required] Input data matrix.
- copy_all_inputs (bool): If specified, all input parameters will be deep copied before the method is run. This is useful for debugging problems where the input parameters are being modified by the algorithm, but can slow down the code.
- leaf_size (int): Leaf size in the kd-tree. One-element leaves give the empirically best performance, but at the cost of greater memory requirements. Default value 1.
- naive (bool): Compute the MST using the O(n^2) naive algorithm.
- verbose (bool): Display informational messages and the full list of parameters and timers at the end of execution.
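
To give a feel for what an O(n^2) brute-force computation looks like, here is a small pure-Python sketch that builds a Euclidean minimum spanning tree with Prim's algorithm and returns edges in the [lesser index, greater index, distance] layout. This is an assumption-laden illustration, not mlpack's implementation of the 'naive' option: Prim's algorithm is swapped in because it is short, 'naive_emst' is a hypothetical name, and the order of the returned edges may differ from what emst produces.

```python
import math


def naive_emst(points):
    """Prim's algorithm on the complete Euclidean graph (O(n^2) distances)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    best_from = [-1] * n        # tree vertex that provides that connection
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Pick the vertex outside the tree that is closest to the tree.
        k = min((j for j in range(n) if not in_tree[j]),
                key=lambda j: best_dist[j])
        in_tree[k] = True
        lesser, greater = sorted((best_from[k], k))
        edges.append([float(lesser), float(greater), best_dist[k]])
        # Update the remaining vertices with distances to the new vertex.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[k], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = k
    return edges


# Four hypothetical 2-D points, chosen only for illustration.
points = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
edges = naive_emst(points)
for edge in edges:
    print(edge)
```

A reference like this can be handy for sanity-checking results on tiny inputs, since the total tree length should match regardless of which algorithm produced the tree.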

## output options

The return value from the binding is a dict containing the following elements:

- output (numpy matrix, float dtype): Output data. Stored as an edge list.