mlpack
git-master

Introduction
mlpack uses Armadillo matrices for matrix support. Armadillo is a fast C++ matrix library which makes use of advanced template techniques to provide the fastest possible matrix operations.
Documentation on Armadillo can be found on its website:
http://arma.sourceforge.net/docs.html
Nonetheless, there are a few further caveats for mlpack Armadillo usage.
Column-major Matrices
Armadillo matrices are stored in column-major format; this means that, in memory, the elements of each column are contiguous.
This means that, for the vast majority of machine learning methods, it is faster to store observations as columns and dimensions as rows. This is counter to most standard machine learning texts!
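To make the layout concrete, here is a minimal sketch in plain C++ of how a column-major buffer is indexed. The names ColumnMajorOffset and ColumnMajorData are hypothetical and are not part of Armadillo or mlpack; the sketch only illustrates why reading one observation (one column) touches a single contiguous block of memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Armadillo function): offset of element
// (row, col) in a column-major buffer that has n_rows rows.
std::size_t ColumnMajorOffset(std::size_t row, std::size_t col,
                              std::size_t n_rows)
{
  return row + col * n_rows;
}

// Hypothetical container: dimensions are rows, observations are columns.
struct ColumnMajorData
{
  std::size_t n_rows;      // number of dimensions
  std::size_t n_cols;      // number of observations
  std::vector<double> mem; // n_rows * n_cols values, stored column after column

  // Pointer to the first value of one observation. The n_rows values
  // starting here are contiguous in memory.
  const double* Observation(std::size_t col) const
  {
    return mem.data() + ColumnMajorOffset(0, col, n_rows);
  }
};
```

With 3 dimensions and 2 observations, the second observation starts at offset 3, immediately after the first. A row, by contrast, is spread across the buffer with a stride of n_rows.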
Major implications of this are for linear algebra. For instance, the covariance of a matrix is typically
C = X^T X
but for a column-wise matrix, it is
C = X X^T
(in both cases for mean-centered data, up to a normalizing constant). This is very important to keep in mind! If your mlpack code is not working, this may be a factor in why.
Loading Matrices
mlpack provides data::Load() and data::Save() functions, which should be used instead of Armadillo's loading and saving functions.
Most machine learning data is stored in row-major format; a CSV, for example, will generally have one observation per line and each column will correspond to a dimension.
data::Load() transposes the matrix upon loading, and data::Save() transposes it back upon saving. This means that a CSV with 13 lines of 5 values each
is actually loaded with 5 rows and 13 columns, not 13 rows and 5 columns as the CSV is written.
This is important to remember! More information on mlpack's loading functionality can be found in File formats and loading data in mlpack.
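The effect of the transpose can be sketched in plain C++. This is not how data::Load() is implemented; LoadedMatrix and LoadCsvTransposed are hypothetical names, and the sketch assumes well-formed numeric CSV text with the same number of fields on every line. It shows that reading a one-observation-per-line CSV in order already yields a column-major matrix with one observation per column.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type, not an mlpack or Armadillo class.
struct LoadedMatrix
{
  std::size_t n_rows = 0;  // dimensions (CSV fields per line)
  std::size_t n_cols = 0;  // observations (CSV lines)
  std::vector<double> mem; // column-major storage
};

// Parse CSV text with one observation per line into a column-major matrix
// with one observation per column.
LoadedMatrix LoadCsvTransposed(const std::string& csv)
{
  LoadedMatrix out;
  std::istringstream lines(csv);
  std::string line;
  while (std::getline(lines, line))
  {
    if (line.empty())
      continue; // skip blank lines

    std::istringstream fields(line);
    std::string field;
    std::size_t count = 0;
    while (std::getline(fields, field, ','))
    {
      // Appending the fields of one line in order stores that observation
      // contiguously, which is exactly one column in column-major form.
      out.mem.push_back(std::stod(field));
      ++count;
    }
    out.n_rows = count;
    ++out.n_cols;
  }
  return out;
}
```

Passing text with 13 lines of 5 values each gives n_rows equal to 5 and n_cols equal to 13, matching the shape described above. In real mlpack code, data::Load() should be used instead.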