An implementation of Neighborhood Components Analysis, both a linear dimensionality reduction technique and a distance learning technique.
Public Member Functions
| NCA (const arma::mat &dataset, const arma::Row< size_t > &labels, MetricType metric=MetricType())
| Construct the Neighborhood Components Analysis object.
const arma::mat & | Dataset () const
| Get the dataset reference.
const arma::Row< size_t > & | Labels () const
| Get the labels reference.
void | LearnDistance (arma::mat &outputMatrix)
| Perform Neighborhood Components Analysis.
const OptimizerType & | Optimizer () const
| Get the optimizer.
OptimizerType & | Optimizer ()
NCA seeks to improve k-nearest-neighbor classification on a dataset by learning a linear transformation that rescales the dimensions. The method is nonparametric and does not require a value of k. Instead of hard nearest-neighbor assignments, it uses stochastic ("soft") neighbor assignments, and it maximizes the expected accuracy of those assignments with gradient-based optimization.
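The soft neighbor assignment described above can be sketched in plain C++. This is a minimal illustration, not the code of this class: the names Point, SquaredDistance, SoftNeighborProbabilities and ExpectedAccuracy are hypothetical, the points are assumed to have already been transformed by a candidate matrix, and the exponential weighting of squared Euclidean distance is the formulation commonly given for NCA rather than something stated on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one (already transformed) data point.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const Point& a, const Point& b)
{
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
  {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// p[i][j] is the probability that point i picks point j as its neighbor.
// A point never picks itself, so p[i][i] stays 0.
std::vector<std::vector<double>> SoftNeighborProbabilities(
    const std::vector<Point>& points)
{
  const std::size_t n = points.size();
  std::vector<std::vector<double>> p(n, std::vector<double>(n, 0.0));
  for (std::size_t i = 0; i < n; ++i)
  {
    double denom = 0.0;
    for (std::size_t j = 0; j < n; ++j)
    {
      if (j == i)
        continue;
      p[i][j] = std::exp(-SquaredDistance(points[i], points[j]));
      denom += p[i][j];
    }
    // Normalize the row; skip if every weight underflowed to zero.
    if (denom > 0.0)
    {
      for (std::size_t j = 0; j < n; ++j)
        p[i][j] /= denom;
    }
  }
  return p;
}

// Expected leave-one-out accuracy: for each point, the total probability
// of picking a neighbor with the same label, averaged over all points.
double ExpectedAccuracy(const std::vector<Point>& points,
                        const std::vector<std::size_t>& labels)
{
  assert(points.size() == labels.size());
  if (points.empty())
    return 0.0;
  const std::vector<std::vector<double>> p = SoftNeighborProbabilities(points);
  double total = 0.0;
  for (std::size_t i = 0; i < points.size(); ++i)
    for (std::size_t j = 0; j < points.size(); ++j)
      if (j != i && labels[i] == labels[j])
        total += p[i][j];
  return total / static_cast<double>(points.size());
}
```

Because every probability is a smooth function of the distances, the expected accuracy is differentiable with respect to the transformation, which is what makes gradient-based optimization applicable.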
For more details, see the following published paper:
NCA (const arma::mat & dataset, const arma::Row< size_t > & labels, MetricType metric = MetricType())
Construct the Neighborhood Components Analysis object.
This simply stores references to the dataset and labels, as well as the parameters for optimization, before the actual optimization is performed.
dataset | Input dataset. |
labels | Input dataset labels. |
stepSize | Step size for stochastic gradient descent. |
maxIterations | Maximum iterations for stochastic gradient descent. |
tolerance | Tolerance for termination of stochastic gradient descent. |
shuffle | Whether or not to shuffle the dataset during SGD. |
metric | Instantiated metric to use. |
void LearnDistance (arma::mat & outputMatrix)
Perform Neighborhood Components Analysis.
The output distance learning matrix is written into the passed reference. If LearnDistance() is called with an outputMatrix which has the correct size (dataset.n_rows x dataset.n_rows), that matrix will be used as the starting point for optimization.
outputMatrix | Covariance matrix of Mahalanobis distance. |
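To show how a learned matrix might be used afterwards, here is a minimal sketch in plain C++. It assumes the matrix is applied as a linear transformation to each point before taking the Euclidean distance, which is one common reading of a learned Mahalanobis-style distance; the names Matrix, Transform and LearnedDistance are hypothetical and are not part of this class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a dense matrix stored as a vector of rows.
using Matrix = std::vector<std::vector<double>>;

// Compute a * x for a matrix a and a column vector x.
std::vector<double> Transform(const Matrix& a, const std::vector<double>& x)
{
  std::vector<double> result(a.size(), 0.0);
  for (std::size_t r = 0; r < a.size(); ++r)
  {
    assert(a[r].size() == x.size());
    for (std::size_t c = 0; c < x.size(); ++c)
      result[r] += a[r][c] * x[c];
  }
  return result;
}

// Distance between x and y after both are mapped through a,
// that is, the Euclidean norm of a * (x - y).
double LearnedDistance(const Matrix& a,
                       const std::vector<double>& x,
                       const std::vector<double>& y)
{
  assert(x.size() == y.size());
  std::vector<double> diff(x.size());
  for (std::size_t d = 0; d < x.size(); ++d)
    diff[d] = x[d] - y[d];

  const std::vector<double> mapped = Transform(a, diff);
  double sum = 0.0;
  for (double v : mapped)
    sum += v * v;
  return std::sqrt(sum);
}
```

With the identity matrix this reduces to the ordinary Euclidean distance; a matrix with a zero row or column ignores the corresponding direction, which is how a learned transformation can suppress uninformative dimensions.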