🔗 NMF
The NMF
class implements non-negative matrix factorization, a technique to
decompose a large (potentially sparse) matrix V
into two smaller matrices W
and H
, such that V ~= W * H
, and W
and H
only contain nonnegative
elements. This technique may be used for dimensionality reduction, or as part
of a recommender system.
The NMF
class allows fully configurable behavior via template
parameters. For more general
matrix factorization strategies, see the AMF
(alternating matrix
factorization) class documentation.
Simple usage example:
// Create a random sparse matrix (V) of size 10x100, with 15% nonzeros.
arma::sp_mat V;
V.sprandu(100, 100, 0.15);
// W and H will be low-rank matrices of size 100x10 and 10x100.
arma::mat W, H;
mlpack::NMF nmf; // Step 1: create object.
double residue = nmf.Apply(V, 10, W, H); // Step 2: apply NMF to decompose V.
// Now print some information about the factorized matrices.
std::cout << "W has size: " << W.n_rows << " x " << W.n_cols << "."
<< std::endl;
std::cout << "H has size: " << H.n_rows << " x " << H.n_cols << "."
<< std::endl;
std::cout << "RMSE of reconstructed matrix: "
<< arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem) << "." << std::endl;
Quick links:
- Constructors: create
NMF
objects. Apply()
: applyNMF
decomposition to data.- Examples of simple usage and links to detailed example projects.
- Template parameters for using different update rules, initialization strategies, and termination criteria.
- Advanced template examples of use with custom template parameters.
See also:
AMF
: alternating matrix factorizationSparseCoding
- mlpack transformations
- Non-negative matrix factorization on Wikipedia
- Learning the parts of objects by non-negative matrix factorization (original NMF paper, pdf)
🔗 Constructors
nmf = NMF()
- Create an
NMF
object. - The rank of the decomposition is specified in the call to
Apply()
.
- Create an
nmf = NMF(SimpleResidueTermination(minResidue=1e-5, maxIterations=10000))
- Create an NMF object with custom termination parameters.
minResidue
(adouble
) specifies the minimum difference of the norm ofW * H
between iterations for termination.maxIterations
specifies the maximum number of iterations before decomposition terminates.
🔗 Applying Decompositions
double residue = nmf.Apply(V, rank, W, H)
- Decompose the matrix
V
into two non-negative matricesW
andH
with rankrank
. W
will be set to sizeV.n_rows
xrank
.H
will be set to sizerank
xV.n_cols
.W
andH
are initialized randomly using the Acol initialization strategy; i.e., each column ofW
is an average of 5 random columns ofV
, andH
is initialized uniformly randomly.- The residue (change in the norm of
W * H
between iterations) is returned.
- Decompose the matrix
Notes:
-
Low values of
rank
will give smaller matricesW
andH
, but the decomposition will be less accurate. Larger values ofrank
will give more accurate decompositions, but will take longer to compute. Every problem is different, sorank
must be specified manually. -
The expression
W * H
can be used to reconstruct the matrixV
. -
Custom behavior, such as custom initialization of
W
andH
, different or custom termination rules, and different update rules are discussed in the advanced functionality section.
Apply()
Parameters:
name | type | description |
---|---|---|
V |
arma::sp_mat or arma::mat |
Input matrix to be factorized. |
rank |
size_t |
Rank of decomposition; lower is smaller, higher is more accurate. |
W |
arma::mat |
Output matrix in which W will be stored. |
H |
arma::mat |
Output matrix in which H will be stored. |
Note: Matrices with different element types can be used for V
, W
, and
H
; e.g., arma::fmat
. While V
can be sparse or dense, W
and H
must be
dense matrices.
🔗 Simple Examples
See also the simple usage example for a trivial use of
NMF
.
Decompose a dense matrix with custom termination parameters.
// Create a low-rank V matrix by multiplying together two random matrices.
arma::mat V = arma::randu<arma::mat>(500, 25) *
arma::randn<arma::mat>(25, 5000);
// Create the NMF object with a looser tolerance of 1e-3 and a maximum of 100
// iterations only.
mlpack::NMF nmf(mlpack::SimpleResidueTermination(1e-3, 500));
arma::mat W, H;
// Decompose with a rank of 25.
// W will have size 500 x 25, and H will have size 25 x 5000.
const double residue = nmf.Apply(V, 25, W, H);
std::cout << "Residue of decomposition: " << residue << "." << std::endl;
// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
Decompose the sparse MovieLens dataset using a rank-12 decomposition and float
element type.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;
mlpack::data::Load("movielens-100k.csv", V, true);
// Create the NMF object.
mlpack::NMF nmf;
arma::fmat W, H;
// Decompose the Movielens dataset with rank 12.
const double residue = nmf.Apply(V, 12, W, H);
std::cout << "Residue of MovieLens decomposition: " << residue << "."
<< std::endl;
// Compute RMSE of decomposition.
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of decomposition: " << rmse << "." << std::endl;
Compare quality of decompositions of MovieLens with different ranks.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
// Create the NMF object.
mlpack::NMF nmf;
arma::mat W, H;
for (size_t rank = 10; rank <= 100; rank += 15)
{
// Decompose with the given rank.
const double residue = nmf.Apply(V, rank, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE for rank-" << rank << " decomposition: " << rmse << "."
<< std::endl;
}
🔗 Advanced Functionality: Template Parameters
The NMF
class has three template parameters that can be used for custom
behavior. The full signature of the class is:
NMF<TerminationPolicyType, InitializationRuleType, UpdateRuleType>
TerminationPolicyType
: the strategy used to choose when to terminate NMF.InitializationRuleType
: the strategy used to choose the initialW
andH
matrices.UpdateRuleType
: the update rules used for NMF:NMFMultiplicativeDistanceUpdate
: update rule that ensure the Frobenius norm of the reconstruction error is decreasing at each iteration.NMFMultiplicativeDivergenceUpdate
: update rules that ensure Kullback-Leibler divergence is decreasing at each iteration.NMFALSUpdate
: alternating least-squares projections forW
andH
.- For custom update rules, use the more general
AMF
class.
🔗 Advanced Functionality: TerminationPolicyType
- Specifies the strategy to use to choose when to stop the NMF algorithm.
- An instantiated
TerminationPolicyType
can be passed to the NMF constructor. - The following choices are available for drop-in usage:
SimpleResidueTermination
(default):
- Terminates when a maximum number of iterations is reached, or when the
residue (change in norm of
W * H
between iterations) is sufficiently small. - Constructor:
SimpleResidueTermination(minResidue=1e-5, maxIterations=10000)
minResidue
(adouble
) specifies the sufficiently small residue for termination.maxIterations
(asize_t
) specifies the maximum number of iterations.
nmf.Apply()
will return the residue of the last iteration.
MaxIterationTermination
:
- Terminates when the maximum number of iterations is reached.
- No other condition is checked.
- Constructor:
MaxIterationTermination(maxIterations=1000)
nmf.Apply()
will return the number of iterations performed.
SimpleToleranceTermination<MatType, WHMatType>
:
- Terminates when the nonzero residual decreases a sufficiently small relative
amount between iterations (e.g.
(lastNonzeroResidual - nonzeroResidual) / lastNonzeroResidual
is below a threshold), or when the maximum number of iterations is reached. - The residual must remain below the threshold for a specified number of iterations.
- The nonzero residual is defined as the root of the sum of squared elements in
the reconstruction error matrix
(V - WH)
, limited to locations whereV
is nonzero. - Constructor:
SimpleToleranceTermination<MatType, WHMatType>(tol=1e-5, maxIter=10000, extraSteps=3)
MatType
should be set to the type ofV
(seeApply()
Parameters).WHMatType
(defaultarma::mat
) should be set to the type ofW
andH
(seeApply()
Parameters).tol
(adouble
) specifies the relative nonzero residual tolerance for convergence.maxIter
(asize_t
) specifies the maximum number of iterations before termination.extraSteps
(asize_t
) specifies the number of iterations where the relative nonzero residual must be below the tolerance for convergence.
- The best
W
andH
matrices (according to the nonzero residual) from the finalextraSteps
iterations are returned bynmf.Apply()
. nmf.Apply()
will return the nonzero residue of the iteration corresponding to the bestW
andH
matrices.
ValidationRMSETermination<MatType>
:
- Holds out a validation set of nonzero elements from
V
, and terminates when the RMSE (root mean squared error) on this validation set is sufficiently small between iterations. - The validation RMSE must remain below the threshold for a specified number of iterations.
MatType
should be set to the type ofV
(seeApply()
Parameters).- Constructor:
ValidationRMSETermination<MatType>(V, numValPoints, tol=1e-5, maxIter=10000, extraSteps=3)
V
is the matrix to be decomposed byApply()
. This will be modified (validation elements will be removed).numValPoints
(asize_t
) specifies number of test points fromV
to be held out.tol
(adouble
) specifies the relative tolerance for the validation RMSE for termination.maxIter
(asize_t
) specifies the maximum number of iterations before termination.extraSteps
(asize_t
) specifies the number of iterations where the validation RMSE must be below the tolerance for convergence.
- The best
W
andH
matrices (according to the validation RMSE) from the finalextraSteps
iterations are returned bynmf.Apply()
. nmf.Apply()
will return the best validation RMSE.
Custom policies:
- A custom class for termination behavior must implement the following functions.
// You can use this as a starting point for implementation.
class CustomTerminationPolicy
{
public:
// Initialize the termination policy for the given matrix V. (It is okay to
// do nothing.) This function is called at the beginning of Apply().
//
// If the termination policy requires V to compute convergence, store a
// reference or pointer to it in this function.
template<typename MatType>
void Initialize(const MatType& V);
// Check if convergence has occurred for the given W and H matrices. Return
// `true` if so.
//
// Note that W and H may have different types than V (i.e. V may be sparse,
// and W and H must be dense.)
template<typename WHMatType>
bool IsConverged(const MatType& H, const MatType& W);
// Return the value that should be returned for the `nmf.Apply()` function
// when convergence has been reached. This is called at the end of
// `nmf.Apply()`.
const double Index();
// Return the number of iterations that have been completed. This is called
// at the end of `nmf.Apply()`.
const size_t Iteration();
};
🔗 Advanced Functionality: InitializationRuleType
- Specifies the strategy to use to initialize
W
andH
at the beginning of the NMF algorithm. - An initialized
InitializationRuleType
can be passed to the following constructor:nmf = NMF(terminationPolicy, initializationRule)
- The following choices are available for drop-in usage:
RandomAcolInitialization<N>
(default):
- Initialize
W
by averagingN
randomly chosen columns ofV
. - Initialize
H
as uniform random in the range[0, 1]
. - The default value for
N
is 5. - See also the paper describing the strategy.
NoInitialization
:
- When
nmf.Apply(V, rank, W, H)
, the existing values ofW
andH
will be used. - If
W
is not of sizeV.n_rows
xrank
, or ifH
is not of sizerank
xV.n_cols
, astd::invalid_argument
exception will be thrown.
GivenInitialization<MatType>
:
- Set
W
and/orH
to the given matrices whenApply()
is called. MatType
should be set to the type ofW
orH
(defaultarma::mat
); seeApply()
Parameters.- Constructors:
GivenInitialization<MatType>(W, H)
- Specify both initial
W
andH
matrices.
- Specify both initial
GivenInitialization<MatType>(M, isW=true)
- If
isW
istrue
, then set initialW
toM
. - If
isW
isfalse
, then set initialH
toM
. - This constructor is meant to only be used with
MergeInitialization
(below).
- If
RandomAMFInitialization
:
- Initialize
W
andH
as uniform random in the range[0, 1]
.
AverageInitialization
:
- Initialize each element of
W
andH
to the square root of the average value ofV
, adding uniform random noise in the range[0, 1]
.
MergeInitialization<WRule, HRule>
:
- Use two different initialization rules, one for
W
(WRule
) and one forH
(HRule
). - Constructors:
MergeInitialization<WRule, HRule>()
- Create the merge initialization with default-constructed rules for
W
andH
.
- Create the merge initialization with default-constructed rules for
MergeInitialization<WRule, HRule>(wRule, hRule)
- Create the merge initialization with instantiated rules for
W
andH
. wRule
andhRule
will be copied.
- Create the merge initialization with instantiated rules for
- Any
WRule
andHRule
classes must implement theInitializeOne()
function.
Custom rules:
- A custom class for initializing
W
andH
must implement the following functions.
// You can use this as a starting point for implementation.
class CustomInitialization
{
public:
// Initialize the W and H matrices, given V and the rank of the decomposition.
// This is called at the start of `Apply()`.
//
// Note that `MatType` may be different from `WHMatType`; e.g., `V` could be
// sparse, but `W` and `H` must be dense.
template<typename MatType, typename WHMatType>
void Initialize(const MatType& V,
const size_t rank,
WHMatType& W,
WHMatType& H);
// Initialize one of the W or H matrices, given V and the rank of the
// decomposition.
//
// If `isW` is `true`, then `M` should be treated as though it is `W`;
// if `isW` is `false`, then `M` should be treated as thought it is `H`.
//
// This function only needs to be implemented if it is intended to use the
// custom initialization strategy with `MergeInitialization`.
template<typename MatType, typename WHMatType>
void InitializeOne(const MatType& V,
const size_t rank,
WHMatType& M,
const bool isW);
};
🔗 Advanced Functionality Examples
Use a pre-specified initialization for W
and H
.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W, H;
// Pre-initialize W and H.
// W will be filled with random values from a normal distribution.
// H will be filled with 1s.
W.randn(V.n_rows, 15);
H.set_size(15, V.n_cols);
H.fill(0.2);
mlpack::NMF<mlpack::SimpleResidueTermination, mlpack::NoInitialization> nmf;
const double residue = nmf.Apply(V, 15, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "RMSE of NMF decomposition with pre-specified W and H: " << rmse
<< "." << std::endl;
Use ValidationRMSETermination
to decompose the MovieLens dataset until the
RMSE of the held-out validation set is sufficiently low.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W, H;
// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);
// Create the NMF object with the instantiated termination policy.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>> nmf(t);
// Perform NMF with a rank of 20.
// Note the RMSE returned here is the RMSE on the validation set.
const double rmse = nmf.Apply(V, 20, W, H);
const double rmseTrain = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "Training RMSE: " << rmseTrain << "." << std::endl;
std::cout << "Validation RMSE: " << rmse << "." << std::endl;
Use all three sets of NMF update rules and compare the RMSE on a held-out validation set.
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_mat V;
mlpack::data::Load("movielens-100k.csv", V, true);
arma::mat W1, W2, W3;
arma::mat H1, H2, H3;
// Create a ValidationRMSETermination class that will hold out 3k points from V.
// This will remove 3000 nonzero entries from V.
mlpack::ValidationRMSETermination<arma::sp_mat> t(V, 3000);
// Multiplicative distance update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDistanceUpdate> nmf1(t);
// Multiplicative divergence update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFMultiplicativeDivergenceUpdate> nmf2(t);
// Alternating least squares update rule.
mlpack::NMF<mlpack::ValidationRMSETermination<arma::sp_mat>,
mlpack::RandomAcolInitialization<5>,
mlpack::NMFALSUpdate> nmf3(t);
const double rmse1 = nmf1.Apply(V, 15, W1, H1);
const double rmse2 = nmf2.Apply(V, 15, W2, H2);
const double rmse3 = nmf3.Apply(V, 15, W3, H3);
// Print the RMSEs.
std::cout << "Mult. dist. update RMSE: " << rmse1 << "." << std::endl;
std::cout << "Mult. div. update RMSE: " << rmse2 << "." << std::endl;
std::cout << "ALS update RMSE: " << rmse3 << "." << std::endl;
Use a custom termination policy that sets a limit on how long NMF is allowed to take. First, we define the termination policy:
class CustomTimeTermination
{
public:
CustomTimeTermination(const double totalAllowedTime) :
totalAllowedTime(totalAllowedTime) { }
template<typename MatType>
void Initialize(const MatType& /* V */)
{
totalTime = 0.0;
iteration = 0;
c.tic();
}
template<typename WHMatType>
bool IsConverged(const WHMatType& /* W */, const WHMatType& /* H */)
{
totalTime += c.toc();
c.tic();
++iteration;
return (totalTime > totalAllowedTime);
}
const double Index() const { return totalTime; }
const size_t Iteration() const { return iteration; }
private:
double totalAllowedTime;
double totalTime;
size_t iteration;
arma::wall_clock c; // used for convenient timing
};
Then we can use it in the test program:
// See https://datasets.mlpack.org/movielens-100k.csv.
arma::sp_fmat V;
mlpack::data::Load("movielens-100k.csv", V, true);
CustomTimeTermination t(5 /* seconds */);
mlpack::NMF<CustomTimeTermination> nmf(t);
arma::fmat W, H;
const double actualTime = nmf.Apply(V, 10, W, H);
const double rmse = arma::norm(V - W * H, "fro") / std::sqrt(V.n_elem);
std::cout << "Actual time used for decomposition: " << actualTime << "."
<< std::endl;
std::cout << "RMSE after ~5 seconds: " << rmse << "." << std::endl;