# mlpack_lars

## NAME

mlpack_lars - LARS: least angle regression (LASSO and Elastic Net)

## SYNOPSIS

mlpack_lars [-i string] [-m unknown] [-l double] [-L double] [-r string] [-t string] [-c bool] [-V bool] [-M unknown] [-o string] [-h -v]

## DESCRIPTION

An implementation of LARS: Least Angle Regression (Stagewise/laSso). This is a stage-wise homotopy-based algorithm for L1-regularized linear regression (LASSO) and L1+L2-regularized linear regression (Elastic Net).

This program is able to train a LARS/LASSO/Elastic Net model or load a model from file, output regression predictions for a test set, and save the trained model to a file. The LARS algorithm is described in more detail below:

Let X be a matrix where each row is a point and each column is a dimension, and let y be a vector of targets.

The Elastic Net problem is to solve

min_beta 0.5 || X * beta - y ||_2^2 + lambda1 ||beta||_1 + 0.5 lambda2 ||beta||_2^2

If lambda1 > 0 and lambda2 = 0, the problem is the LASSO. If lambda1 > 0 and lambda2 > 0, the problem is the Elastic Net. If lambda1 = 0 and lambda2 > 0, the problem is ridge regression. If lambda1 = 0 and lambda2 = 0, the problem is unregularized linear regression.
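The objective and the four cases above can be checked by hand on small inputs. The following is a minimal Python sketch, not part of mlpack: the function names `elastic_net_objective` and `problem_type` are hypothetical, the data is stored as plain lists with one row per point, and both regularization parameters are assumed to be non-negative.

```python
def elastic_net_objective(X, y, beta, lambda1, lambda2):
    """Evaluate 0.5 ||X beta - y||_2^2 + lambda1 ||beta||_1 + 0.5 lambda2 ||beta||_2^2.

    X is a list of rows (one row per point), y a list of targets,
    beta a list of coefficients (one per column of X).
    """
    # Residual of each point: (row . beta) - target.
    residuals = [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) - y_i
        for row, y_i in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)
    l1_norm = sum(abs(b) for b in beta)
    l2_norm_squared = sum(b * b for b in beta)
    return 0.5 * squared_error + lambda1 * l1_norm + 0.5 * lambda2 * l2_norm_squared


def problem_type(lambda1, lambda2):
    """Name the problem selected by the two regularization parameters."""
    if lambda1 < 0 or lambda2 < 0:
        # Assumption of this sketch: negative penalties are rejected.
        raise ValueError("lambda1 and lambda2 must be non-negative")
    if lambda1 > 0 and lambda2 > 0:
        return "elastic net"
    if lambda1 > 0:
        return "lasso"
    if lambda2 > 0:
        return "ridge regression"
    return "unregularized linear regression"


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [1.0, 2.0]
    beta = [1.0, 1.0]
    print(problem_type(0.4, 0.0), elastic_net_objective(X, y, beta, 0.4, 0.0))
```

With these inputs the residuals are 0 and -1, so the squared-error term contributes 0.5 and the l1 penalty contributes 0.4 * 2 = 0.8.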

For efficiency reasons, it is not recommended to use this algorithm with ’--lambda1 (-l)’ = 0. In that case, use the ’linear_regression’ program, which implements both unregularized linear regression and ridge regression.

To train a LARS/LASSO/Elastic Net model, the ’--input_file (-i)’ and ’--responses_file (-r)’ parameters must be given. The ’--lambda1 (-l)’, ’--lambda2 (-L)’, and ’--use_cholesky (-c)’ parameters control the training options. A trained model can be saved with the ’--output_model_file (-M)’ parameter. If no training is desired at all, a model can be passed via the ’--input_model_file (-m)’ parameter.

The program can also provide predictions for test data using either the trained model or the given input model. Test points can be specified with the ’--test_file (-t)’ parameter. Predicted responses to the test points can be saved with the ’--output_predictions_file (-o)’ output parameter.

For example, the following command trains a model on the data ’data.csv’ and responses ’responses.csv’ with lambda1 set to 0.4 and lambda2 set to 0 (so the LASSO problem is solved), and then saves the model to ’lasso_model.bin’:

\$ mlpack_lars --input_file data.csv --responses_file responses.csv --lambda1 0.4 --lambda2 0 --output_model_file lasso_model.bin

The following command uses the ’lasso_model.bin’ to provide predicted responses for the data ’test.csv’ and save those responses to ’test_predictions.csv’:

\$ mlpack_lars --input_model_file lasso_model.bin --test_file test.csv --output_predictions_file test_predictions.csv
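When these commands are driven from a script, it can help to assemble the argument list programmatically and check the parameter rules described above before anything is run. The following Python sketch is one possible way to do that. The helper `build_lars_command` is hypothetical and not part of mlpack. It uses only the long option names documented on this page, takes the program name `mlpack_lars` from the SYNOPSIS, and only builds the argument list without running anything.

```python
def build_lars_command(input_file=None, responses_file=None,
                       input_model_file=None, lambda1=None, lambda2=None,
                       use_cholesky=False, test_file=None,
                       output_model_file=None, output_predictions_file=None,
                       program="mlpack_lars"):
    """Return an argument list for the LARS program (hypothetical helper)."""
    training = input_file is not None or responses_file is not None
    # Training needs both the covariates and the responses.
    if training and (input_file is None or responses_file is None):
        raise ValueError("training needs both input_file and responses_file")
    # Without training, a previously saved model must be supplied.
    if not training and input_model_file is None:
        raise ValueError("give training data or an input_model_file")

    command = [program]
    options = [
        ("--input_file", input_file),
        ("--responses_file", responses_file),
        ("--input_model_file", input_model_file),
        ("--lambda1", lambda1),
        ("--lambda2", lambda2),
        ("--test_file", test_file),
        ("--output_model_file", output_model_file),
        ("--output_predictions_file", output_predictions_file),
    ]
    for flag, value in options:
        if value is not None:
            command += [flag, str(value)]
    if use_cholesky:
        # Boolean flag: present or absent, no value.
        command.append("--use_cholesky")
    return command


if __name__ == "__main__":
    train = build_lars_command(input_file="data.csv",
                               responses_file="responses.csv",
                               lambda1=0.4, lambda2=0,
                               output_model_file="lasso_model.bin")
    print(" ".join(train))
```

The returned list can be passed to `subprocess.run(command, check=True)` on a system where the program is installed.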

## OPTIONAL INPUT OPTIONS

--help (-h) [bool]

Default help info.

--info [string]

Get help on a specific module or option. Default value ’’.

--input_file (-i) [string]

Matrix of covariates (X). Default value ’’.

--input_model_file (-m) [unknown]

Trained LARS model to use. Default value ’’.

--lambda1 (-l) [double]

Regularization parameter for l1-norm penalty. Default value 0.

--lambda2 (-L) [double]

Regularization parameter for l2-norm penalty. Default value 0.

--responses_file (-r) [string]

Matrix of responses/observations (y). Default value ’’.

--test_file (-t) [string]

Matrix containing points to regress on (test points). Default value ’’.

--use_cholesky (-c) [bool]

Use Cholesky decomposition during computation rather than explicitly computing the full Gram matrix.

--verbose (-v) [bool]

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V) [bool]

Display the version of mlpack.

## OPTIONAL OUTPUT OPTIONS

--output_model_file (-M) [unknown]

Output LARS model. Default value ’’.

--output_predictions_file (-o) [string]

If --test_file is specified, this file is where the predicted responses will be saved. Default value ’’.