mlpack_det

NAME

mlpack_det - density estimation with density estimation trees

SYNOPSIS

mlpack_det [-h] [-v]

DESCRIPTION

This program performs a number of functions related to Density Estimation Trees. The optimal Density Estimation Tree (DET) can be trained on a set of data (specified by ’--training_file (-t)’) using cross-validation (with number of folds specified with the ’--folds (-f)’ parameter). This trained density estimation tree may then be saved with the ’--output_model_file (-M)’ output parameter.

The variable importances (that is, the feature importance values for each dimension) may be saved with the ’--vi_file (-i)’ output parameter, and the density estimates for each training point may be saved with the ’--training_set_estimates_file (-e)’ output parameter.
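The training workflow described above can be sketched as a single invocation built from the documented options. The file names (train.csv, det_model.xml, vi.csv, train_estimates.csv) are placeholders assumed for illustration, as is the choice of a .xml model file. The command only runs if mlpack_det is on the PATH and the training file exists.

```shell
#!/bin/sh
# Placeholder file names (assumptions, not defined by mlpack).
train=train.csv
model=det_model.xml
importances=vi.csv
estimates=train_estimates.csv

# Train with 10-fold cross-validation and save the tree, the variable
# importances, and the density estimate for each training point.
set -- mlpack_det --training_file "$train" --folds 10 \
    --output_model_file "$model" \
    --vi_file "$importances" \
    --training_set_estimates_file "$estimates"
cmd=$*
echo "$cmd"

# Run only when the program and the input data are actually available.
if command -v mlpack_det >/dev/null 2>&1 && [ -f "$train" ]; then
    "$@"
fi
```

Each output option may be omitted if that output is not needed.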

Enabling path printing for each node outputs the path from the root node to a leaf for each entry in the test set, or training set (if a test set is not provided). Strings like ’LRLRLR’ (indicating that traversal went to the left child, then the right child, then the left child, and so forth) will be output. If ’lr-id’ or ’id-lr’ are given as the ’--path_format (-p)’ parameter, then the ID (tag) of every node along the path will be printed after or before the L or R character indicating the direction of traversal, respectively.
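To make the default ’lr’ notation concrete, the following sketch expands a path string such as ’LRLRLR’ into one line per traversal step. The helper name describe_path is hypothetical and is not part of mlpack. The sketch handles only the plain ’lr’ format, not the ’lr-id’ or ’id-lr’ variants.

```shell
#!/bin/sh
# Hypothetical helper: print one line per step of an 'lr' path string.
describe_path() {
    path=$1
    depth=0
    while [ -n "$path" ]; do
        step=${path%"${path#?}"}   # first character of the remaining path
        path=${path#?}             # drop that character
        depth=$((depth + 1))
        case $step in
            L) echo "step $depth: left child" ;;
            R) echo "step $depth: right child" ;;
            *) echo "step $depth: unexpected character '$step'" >&2
               return 1 ;;
        esac
    done
}

describe_path LRLRLR
```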

This program can also provide density estimates for a set of test points, specified with the ’--test_file (-T)’ parameter. The density estimation tree used for this task will be either the tree trained on the given training points or a tree given with the ’--input_model_file (-m)’ parameter. The density estimates for the test points may be saved using the ’--test_set_estimates_file (-E)’ output parameter.
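As a minimal sketch of estimating densities with a previously saved tree, the fragment below combines the three options named above. The file names (det_model.xml, test.csv, test_estimates.csv) are assumed placeholders. The command only runs if mlpack_det and both input files are present.

```shell
#!/bin/sh
# Placeholder file names (assumptions for illustration).
model=det_model.xml
test_points=test.csv
estimates=test_estimates.csv

# Load a trained tree and write one density estimate per test point.
set -- mlpack_det --input_model_file "$model" \
    --test_file "$test_points" \
    --test_set_estimates_file "$estimates"
cmd=$*
echo "$cmd"

# Run only when the program and its inputs are actually available.
if command -v mlpack_det >/dev/null 2>&1 \
    && [ -f "$model" ] && [ -f "$test_points" ]; then
    "$@"
fi
```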

OPTIONAL INPUT OPTIONS

--folds (-f) [int]

The number of folds of cross-validation to perform for the estimation (0 means leave-one-out cross-validation, LOOCV). Default value 10.

--help (-h) [bool]

Default help info.

--info [string]

Get help on a specific module or option. Default value ’’.

--input_model_file (-m) [string]

Trained density estimation tree to load. Default value ’’.

--max_leaf_size (-L) [int]

The maximum size of a leaf in the unpruned, fully grown DET. Default value 10.

--min_leaf_size (-l) [int]

The minimum size of a leaf in the unpruned, fully grown DET. Default value 5.

--path_format (-p) [string]

The format of path printing: ’lr’, ’id-lr’, or ’lr-id’. Default value ’lr’.

--skip_pruning (-s) [bool]

Whether to bypass the pruning process and output the unpruned tree only.

--test_file (-T) [string]

A set of test points to estimate the density of. Default value ’’.

--training_file (-t) [string]

The data set on which to build a density estimation tree. Default value ’’.

--verbose (-v) [bool]

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V) [bool]

Display the version of mlpack.
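The short option forms listed above can be combined in the same way as the long ones. The sketch below requests leave-one-out cross-validation with non-default leaf sizes and verbose output. The file name train.csv and the leaf size values 3 and 20 are arbitrary assumptions chosen for illustration, not recommendations.

```shell
#!/bin/sh
# Placeholder training file (assumption).
train=train.csv

# -f 0 : leave-one-out cross-validation
# -l 3 : minimum leaf size (illustrative value)
# -L 20: maximum leaf size (illustrative value)
# -v   : print informational messages, parameters, and timers
set -- mlpack_det -t "$train" -f 0 -l 3 -L 20 -v
cmd=$*
echo "$cmd"

# Run only when the program and the input data are actually available.
if command -v mlpack_det >/dev/null 2>&1 && [ -f "$train" ]; then
    "$@"
fi
```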

OPTIONAL OUTPUT OPTIONS

--output_model_file (-M) [string]

Output to save trained density estimation tree to. Default value ’’.

--tag_counters_file (-c) [string]

The file to output the number of points that went to each leaf. Default value ’’.

--tag_file (-g) [string]

The file to output the tags (and possibly paths) for each sample in the test set. Default value ’’.

--test_set_estimates_file (-E) [string]

The output estimates on the test set from the final optimally pruned tree. Default value ’’.

--training_set_estimates_file (-e) [string]

The output density estimates on the training set from the final optimally pruned tree. Default value ’’.

--vi_file (-i) [string]

The output variable importance values for each feature. Default value ’’.
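The tag-related outputs can be requested alongside a test set, as in the following sketch. It assumes placeholder file names (det_model.xml, test.csv, tags.csv, tag_counts.csv) and assumes that these options may be combined in a single run. The exact layout of the resulting files is not described here and should be checked against the mlpack documentation.

```shell
#!/bin/sh
# Placeholder file names (assumptions for illustration).
model=det_model.xml
test_points=test.csv
tags=tags.csv
counts=tag_counts.csv

# Ask for the tags of each test sample, with node IDs printed after each
# L or R direction character ('lr-id'), plus the per-leaf point counts.
set -- mlpack_det --input_model_file "$model" \
    --test_file "$test_points" \
    --tag_file "$tags" --path_format lr-id \
    --tag_counters_file "$counts"
cmd=$*
echo "$cmd"

# Run only when the program and its inputs are actually available.
if command -v mlpack_det >/dev/null 2>&1 \
    && [ -f "$model" ] && [ -f "$test_points" ]; then
    "$@"
fi
```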

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.