A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.
Public Member Functions  
HMM (const size_t states=0, const Distribution emissions=Distribution(), const double tolerance=1e-5)  
Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.
HMM (const arma::vec &initial, const arma::mat &transition, const std::vector< Distribution > &emission, const double tolerance=1e-5)  
Create the Hidden Markov Model with the given initial probability vector, the given transition matrix, and the given emission distributions.
size_t  Dimensionality () const 
Get the dimensionality of observations.
size_t &  Dimensionality () 
Set the dimensionality of observations.
const std::vector< Distribution > &  Emission () const 
Return the emission distributions.
std::vector< Distribution > &  Emission () 
Return a modifiable reference to the emission distributions.
double  EmissionLogLikelihood (const arma::vec &emissionLogProb, double &logLikelihood, arma::vec &forwardLogProb) const 
Compute the log-likelihood of the given emission probability up to time t, storing the result in logLikelihood.
double  EmissionLogScaleFactor (const arma::vec &emissionLogProb, arma::vec &forwardLogProb) const 
Compute the log of the scaling factor of the given emission probability at time t.
double  Estimate (const arma::mat &dataSeq, arma::mat &stateProb, arma::mat &forwardProb, arma::mat &backwardProb, arma::vec &scales) const 
Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.
double  Estimate (const arma::mat &dataSeq, arma::mat &stateProb) const 
Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.
void  Filter (const arma::mat &dataSeq, arma::mat &filterSeq, size_t ahead=0) const 
HMM filtering.
void  Generate (const size_t length, arma::mat &dataSequence, arma::Row< size_t > &stateSequence, const size_t startState=0) const 
Generate a random data sequence of the given length.
const arma::vec &  Initial () const 
Return the vector of initial state probabilities.
arma::vec &  Initial () 
Modify the vector of initial state probabilities.
template < typename Archive >  
void  load (Archive &ar, const uint32_t version) 
Load the object.
double  LogEstimate (const arma::mat &dataSeq, arma::mat &stateLogProb, arma::mat &forwardLogProb, arma::mat &backwardLogProb, arma::vec &logScales) const 
Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.
double  LogLikelihood (const arma::mat &dataSeq) const 
Compute the log-likelihood of the given data sequence.
double  LogLikelihood (const arma::vec &data, double &logLikelihood, arma::vec &forwardLogProb) const 
Compute the log-likelihood of the given data up to time t, storing the result in logLikelihood.
double  LogScaleFactor (const arma::vec &data, arma::vec &forwardLogProb) const 
Compute the log of the scaling factor of the given data at time t.
double  Predict (const arma::mat &dataSeq, arma::Row< size_t > &stateSeq) const 
Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.
template < typename Archive >  
void  save (Archive &ar, const uint32_t version) const 
Save the object.
void  Smooth (const arma::mat &dataSeq, arma::mat &smoothSeq) const 
HMM smoothing.
double  Tolerance () const 
Get the tolerance of the Baum-Welch algorithm.
double &  Tolerance () 
Modify the tolerance of the Baum-Welch algorithm.
double  Train (const std::vector< arma::mat > &dataSeq) 
Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.
void  Train (const std::vector< arma::mat > &dataSeq, const std::vector< arma::Row< size_t > > &stateSeq) 
Train the model using the given labeled observations; the transition and emission matrices are directly estimated.
const arma::mat &  Transition () const 
Return the transition matrix.
arma::mat &  Transition () 
Return a modifiable transition matrix reference.
Protected Member Functions  
void  Backward (const arma::mat &dataSeq, const arma::vec &logScales, arma::mat &backwardLogProb, arma::mat &logProbs) const 
The Backward algorithm (part of the Forward-Backward algorithm).
void  Forward (const arma::mat &dataSeq, arma::vec &logScales, arma::mat &forwardLogProb, arma::mat &logProbs) const 
The Forward algorithm (part of the Forward-Backward algorithm).
arma::vec  ForwardAtT0 (const arma::vec &emissionLogProb, double &logScales) const 
Given emission probabilities, computes forward probabilities at time t=0.
arma::vec  ForwardAtTn (const arma::vec &emissionLogProb, double &logScales, const arma::vec &prevForwardLogProb) const 
Given emission probabilities, computes forward probabilities for time t>0.
Protected Attributes  
std::vector< Distribution >  emission 
Set of emission probability distributions; one for each state.
arma::mat  logTransition 
Transition probability matrix. No need to be mutable in mlpack 4.0.
arma::mat  transitionProxy 
A proxy variable in linear space for logTransition.
A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.
This HMM class supports training (supervised and unsupervised), prediction of state sequences via the Viterbi algorithm, estimation of state probabilities, generation of random sequences, and calculation of the log-likelihood of a given sequence.
The template parameter, Distribution, specifies the distribution which the emissions follow. The class should implement the following functions:
See the mlpack::distribution::DiscreteDistribution class for an example. One would use the DiscreteDistribution class when the observations are nonnegative integers. Other distributions could be Gaussians, a mixture of Gaussians (GMM), or any other probability distribution implementing the four Distribution functions.
Usage of the HMM class generally involves either training an HMM or loading an already-known HMM and taking probability measurements of sequences. Example code for supervised training of a Gaussian HMM (that is, where the emission output distribution is a single Gaussian for each hidden state) is given below.
Once initialized, the HMM can evaluate the probability of a certain sequence (with LogLikelihood()), predict the most likely sequence of hidden states (with Predict()), generate a sequence (with Generate()), or estimate the probabilities of each state for a sequence of observations (with Estimate()).
Distribution  Type of emission distribution for this HMM. 
HMM  (  const size_t  states = 0 , 
const Distribution  emissions = Distribution() , 

const double  tolerance = 1e-5 

) 
Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.
The dimensionality of the observations is taken from the emissions variable, so it is important that the given default emission distribution is set with the correct dimensionality. Alternatively, set the dimensionality with Dimensionality(). Optionally, the tolerance for convergence of the Baum-Welch algorithm can be set.
By default, the transition matrix and initial probability vector are set to contain equal probability for each state.
states  Number of states. 
emissions  Default distribution for emissions. 
tolerance  Tolerance for convergence of training algorithm (Baum-Welch). 
HMM  (  const arma::vec &  initial, 
const arma::mat &  transition,  
const std::vector< Distribution > &  emission,  
const double  tolerance = 1e-5 

) 
Create the Hidden Markov Model with the given initial probability vector, the given transition matrix, and the given emission distributions.
The dimensionality of the observations of the HMM is taken from the given emission distributions. Alternatively, the dimensionality can be set with Dimensionality().
The initial state probability vector should have length equal to the number of states, and each entry represents the probability of being in the given state at time T = 0 (the beginning of a sequence).
The transition matrix should be such that T(i, j) is the probability of transition to state i from state j. The columns of the matrix should sum to 1.
The emission matrix should be such that E(i, j) is the probability of emission i while in state j. The columns of the matrix should sum to 1.
Optionally, the tolerance for convergence of the Baum-Welch algorithm can be set.
initial  Initial state probabilities. 
transition  Transition matrix. 
emission  Emission distributions. 
tolerance  Tolerance for convergence of training algorithm (Baum-Welch). 
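The constraints described for this constructor (initial vector length equal to the number of states, columns of the transition matrix summing to 1) can be checked before a model is built. The following is a minimal sketch using only the standard library rather than Armadillo or mlpack; the `Matrix` alias and the helpers `IsProbabilityVector`, `IsColumnStochastic`, and `IsValidHmmInput` are hypothetical names introduced for illustration, and the tolerance `eps` is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense matrix type for this sketch: m[i][j] is row i, column j.
using Matrix = std::vector<std::vector<double>>;

// True if all entries are nonnegative and they sum to 1 (within eps).
bool IsProbabilityVector(const std::vector<double>& v, double eps = 1e-9)
{
  assert(eps >= 0.0);
  double sum = 0.0;
  for (double p : v)
  {
    if (p < 0.0) return false;
    sum += p;
  }
  return std::abs(sum - 1.0) <= eps;
}

// True if the matrix is square, nonnegative, and each COLUMN sums to 1,
// matching the convention T(i, j) = P(next state i | current state j).
bool IsColumnStochastic(const Matrix& t, double eps = 1e-9)
{
  assert(eps >= 0.0);
  const std::size_t n = t.size();
  for (const auto& row : t)
    if (row.size() != n) return false;
  for (std::size_t j = 0; j < n; ++j)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      if (t[i][j] < 0.0) return false;
      sum += t[i][j];
    }
    if (std::abs(sum - 1.0) > eps) return false;
  }
  return true;
}

// Combined check: one initial probability per state, valid distributions.
bool IsValidHmmInput(const std::vector<double>& initial,
                     const Matrix& transition)
{
  return initial.size() == transition.size() &&
         IsProbabilityVector(initial) &&
         IsColumnStochastic(transition);
}
```

A matrix whose rows (rather than columns) sum to 1 fails this check, which is the most common way to get the convention backwards.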

protected 
The Backward algorithm (part of the Forward-Backward algorithm).
Computes backward probabilities for each state for each observation in the given data sequence, using the scaling factors found (presumably) by Forward(). The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.
dataSeq  Data sequence to compute probabilities for. 
logScales  Vector of log of scaling factors. 
backwardLogProb  Matrix in which backward probabilities will be saved. 
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().

double EmissionLogLikelihood  (  const arma::vec &  emissionLogProb, 
double &  logLikelihood,  
arma::vec &  forwardLogProb  
)  const 
Compute the log-likelihood of the given emission probability up to time t, storing the result in logLikelihood.
This is meant for incremental or streaming computation of the log-likelihood of a sequence. For the first data point, provide an empty forwardLogProb vector.
emissionLogProb  Emission probability at time t. 
logLikelihood  Log-likelihood of the given sequence of emission probability up to time t-1. This will be overwritten with the log-likelihood of the given emission probability up to time t. 
forwardLogProb  Vector in which forward probabilities will be saved. Passing forwardLogProb as an empty vector indicates the start of the sequence (i.e. time t=0). 
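The streaming pattern described here (empty forward vector at t=0, one update per observation, log scale factors accumulated into a running log-likelihood) can be sketched in standalone form. This is an illustration of the general scaled forward recursion in log space, not mlpack's implementation; `ToyHmm`, `LogSumExp`, and `StreamStep` are hypothetical names, and the transition matrix follows the column convention T(i, j) = P(state i | previous state j).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical container for the model parameters used by this sketch.
struct ToyHmm
{
  std::vector<double> initial;  // initial[i] = P(state i at t = 0)
  Matrix transition;            // transition[i][j] = P(i | previous j)
};

// Numerically stable log(sum_i exp(v[i])).
double LogSumExp(const std::vector<double>& v)
{
  const double negInf = -std::numeric_limits<double>::infinity();
  double maxVal = negInf;
  for (double x : v) maxVal = std::max(maxVal, x);
  if (maxVal == negInf) return negInf;
  double sum = 0.0;
  for (double x : v) sum += std::exp(x - maxVal);
  return maxVal + std::log(sum);
}

// One streaming step. emissionLogProb[i] = log P(observation | state i).
// An empty forwardLogProb marks the start of the sequence. On return,
// forwardLogProb holds the normalized log forward probabilities, and the
// return value is the log scale factor for this time step.
double StreamStep(const ToyHmm& hmm,
                  const std::vector<double>& emissionLogProb,
                  std::vector<double>& forwardLogProb)
{
  const std::size_t n = hmm.initial.size();
  assert(emissionLogProb.size() == n);
  std::vector<double> next(n);
  if (forwardLogProb.empty())
  {
    for (std::size_t i = 0; i < n; ++i)
      next[i] = std::log(hmm.initial[i]) + emissionLogProb[i];
  }
  else
  {
    assert(forwardLogProb.size() == n);
    std::vector<double> terms(n);
    for (std::size_t i = 0; i < n; ++i)
    {
      for (std::size_t j = 0; j < n; ++j)
        terms[j] = std::log(hmm.transition[i][j]) + forwardLogProb[j];
      next[i] = LogSumExp(terms) + emissionLogProb[i];
    }
  }
  const double logScale = LogSumExp(next);
  for (double& x : next) x -= logScale;
  forwardLogProb = next;
  return logScale;
}
```

Summing the values returned by `StreamStep` over all time steps gives the log-likelihood of the sequence seen so far, which is the quantity the logLikelihood parameter carries from one call to the next.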
double EmissionLogScaleFactor  (  const arma::vec &  emissionLogProb, 
arma::vec &  forwardLogProb  
)  const 
Compute the log of the scaling factor of the given emission probability at time t.
To calculate the log-likelihood for the whole sequence, accumulate the log scale factor over the entire sequence. This is meant for incremental or streaming computation of the log-likelihood of a sequence. For the first data point, provide an empty forwardLogProb vector.
emissionLogProb  Emission probability at time t. 
forwardLogProb  Vector in which forward probabilities will be saved. Passing forwardLogProb as an empty vector indicates the start of the sequence (i.e. time t=0). 
double Estimate  (  const arma::mat &  dataSeq, 
arma::mat &  stateProb,  
arma::mat &  forwardProb,  
arma::mat &  backwardProb,  
arma::vec &  scales  
)  const 
Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.
Each matrix which is returned has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.
dataSeq  Sequence of observations. 
stateProb  Matrix in which the probabilities of each state at each time interval will be stored. 
forwardProb  Matrix in which the forward probabilities of each state at each time interval will be stored. 
backwardProb  Matrix in which the backward probabilities of each state at each time interval will be stored. 
scales  Vector in which the scaling factors at each time interval will be stored. 
double Estimate  (  const arma::mat &  dataSeq, 
arma::mat &  stateProb  
)  const 
Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.
The returned matrix of state probabilities has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.
dataSeq  Sequence of observations. 
stateProb  Probabilities of each state at each time interval. 
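To make the output layout concrete (rows are hidden states, columns are time steps), here is a minimal standalone sketch of a scaled Forward-Backward pass. It is not mlpack's code: `ForwardBackward` and `Matrix` are hypothetical names, emission probabilities are assumed to be precomputed as `emissionProb[i][t] = P(observation t | state i)`, and the return value of this sketch is the log-likelihood of the whole observation sequence.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Fills stateProb[i][t] = P(state i at time t | all observations).
// transition[i][j] = P(state i | previous state j).
double ForwardBackward(const std::vector<double>& initial,
                       const Matrix& transition,
                       const Matrix& emissionProb,
                       Matrix& stateProb)
{
  const std::size_t n = initial.size();
  assert(n > 0 && transition.size() == n && emissionProb.size() == n);
  const std::size_t len = emissionProb[0].size();
  assert(len > 0);

  Matrix fwd(n, std::vector<double>(len, 0.0));
  Matrix bwd(n, std::vector<double>(len, 1.0));
  std::vector<double> scales(len, 0.0);
  double logLik = 0.0;

  // Forward pass, normalizing each column by its scale factor.
  for (std::size_t t = 0; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      double prior = 0.0;
      if (t == 0)
        prior = initial[i];
      else
        for (std::size_t j = 0; j < n; ++j)
          prior += transition[i][j] * fwd[j][t - 1];
      fwd[i][t] = prior * emissionProb[i][t];
      scales[t] += fwd[i][t];
    }
    for (std::size_t i = 0; i < n; ++i) fwd[i][t] /= scales[t];
    logLik += std::log(scales[t]);
  }

  // Backward pass, reusing the forward scale factors.
  for (std::size_t t = len; t-- > 1;)
  {
    for (std::size_t j = 0; j < n; ++j)
    {
      double s = 0.0;
      for (std::size_t i = 0; i < n; ++i)
        s += transition[i][j] * emissionProb[i][t] * bwd[i][t];
      bwd[j][t - 1] = s / scales[t];
    }
  }

  // Posterior state probabilities; each column sums to 1.
  stateProb.assign(n, std::vector<double>(len, 0.0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t t = 0; t < len; ++t)
      stateProb[i][t] = fwd[i][t] * bwd[i][t];
  return logLik;
}
```

Because the backward pass divides by the same scale factors found in the forward pass, the elementwise product needs no further normalization.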
void Filter  (  const arma::mat &  dataSeq, 
arma::mat &  filterSeq,  
size_t  ahead = 0 

)  const 
HMM filtering.
Computes the k-step-ahead expected emission at each time conditioned only on prior observations. That is, E{ Y[t+k] | Y[0], ..., Y[t] }. The returned matrix has columns equal to the number of observations. Note that the expectation may not be meaningful for discrete emissions.
dataSeq  Sequence of observations. 
filterSeq  Vector in which the expected emission sequence will be stored. 
ahead  Number of steps ahead (k) for expectations. 
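One way to read the k-step-ahead expectation: start from the filtered state distribution at time t, push it through the transition matrix k times, and average the per-state emission means under the result. The sketch below does this for scalar emissions. It illustrates the formula only and is not taken from mlpack's source; `ExpectedEmissionAhead` and `Matrix` are hypothetical names, and `means[i]` stands for the mean emission of state i.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// stateProb[j] = P(state j at time t | observations up to t).
// transition[i][j] = P(state i | previous state j).
// Returns E{ Y[t + ahead] | Y[0], ..., Y[t] } for scalar emissions.
double ExpectedEmissionAhead(const Matrix& transition,
                             std::vector<double> stateProb,
                             const std::vector<double>& means,
                             std::size_t ahead)
{
  const std::size_t n = stateProb.size();
  assert(transition.size() == n && means.size() == n);

  // Propagate the state distribution `ahead` steps into the future.
  for (std::size_t k = 0; k < ahead; ++k)
  {
    std::vector<double> next(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
      for (std::size_t j = 0; j < n; ++j)
        next[i] += transition[i][j] * stateProb[j];
    stateProb = next;
  }

  // Average the per-state means under the propagated distribution.
  double expected = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    expected += stateProb[i] * means[i];
  return expected;
}
```

With ahead = 0 this reduces to the ordinary filtered estimate. For discrete emissions the mean of a state is an average of symbol indices, which is why the expectation may not be meaningful there.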

protected 
The Forward algorithm (part of the Forward-Backward algorithm).
Computes forward probabilities for each state for each observation in the given data sequence. The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.
dataSeq  Data sequence to compute probabilities for. 
logScales  Vector in which the log of scaling factors will be saved. 
forwardLogProb  Matrix in which forward probabilities will be saved. 
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().

protected 
Given emission probabilities, computes forward probabilities at time t=0.
emissionLogProb  Emission probability at time t=0. 
logScales  Vector in which the log of scaling factors will be saved. 
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().

protected 
Given emission probabilities, computes forward probabilities for time t>0.
emissionLogProb  Emission probability at time t>0. 
logScales  Vector in which the log of scaling factors will be saved. 
prevForwardLogProb  Previous forward probabilities. 
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().
void Generate  (  const size_t  length, 
arma::mat &  dataSequence,  
arma::Row< size_t > &  stateSequence,  
const size_t  startState = 0 

)  const 
Generate a random data sequence of the given length.
The data sequence is stored in the dataSequence parameter, and the state sequence is stored in the stateSequence parameter. Each column of dataSequence represents a random observation.
length  Length of random sequence to generate. 
dataSequence  Vector to store data in. 
stateSequence  Vector to store states in. 
startState  Hidden state to start sequence in (default 0). 
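The generation procedure can be sketched as a simple sampling loop: emit from the current state, then draw the next state from the matching column of the transition matrix. The code below is an assumption-laden illustration, not mlpack's implementation: `GenerateSequence` is a hypothetical name, emissions are scalar Gaussians with unit variance and per-state means, and the first observation is assumed to be emitted from `startState`.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// transition[i][j] = P(state i | previous state j).
// On return, data[t] is the observation at time t and states[t] the
// hidden state that produced it.
void GenerateSequence(const Matrix& transition,
                      const std::vector<double>& means,
                      std::size_t length,
                      std::size_t startState,
                      std::mt19937& rng,
                      std::vector<double>& data,
                      std::vector<std::size_t>& states)
{
  const std::size_t n = transition.size();
  assert(startState < n && means.size() == n);
  data.assign(length, 0.0);
  states.assign(length, 0);

  std::size_t state = startState;
  for (std::size_t t = 0; t < length; ++t)
  {
    if (t > 0)
    {
      // Column `state` of the transition matrix is the distribution of
      // the next state given the current one.
      std::vector<double> weights(n);
      for (std::size_t i = 0; i < n; ++i)
        weights[i] = transition[i][state];
      std::discrete_distribution<std::size_t> next(weights.begin(),
                                                   weights.end());
      state = next(rng);
    }
    states[t] = state;
    std::normal_distribution<double> emit(means[state], 1.0);
    data[t] = emit(rng);
  }
}
```

Passing the random engine by reference keeps the sketch reproducible: seeding it with a fixed value yields the same sequence on each run with the same standard library.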

void load  (  Archive &  ar, 
const uint32_t  version  
) 
Load the object.
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().
double LogEstimate  (  const arma::mat &  dataSeq, 
arma::mat &  stateLogProb,  
arma::mat &  forwardLogProb,  
arma::mat &  backwardLogProb,  
arma::vec &  logScales  
)  const 
Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.
Each matrix which is returned has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.
dataSeq  Sequence of observations. 
stateLogProb  Matrix in which the log probabilities of each state at each time interval will be stored. 
forwardLogProb  Matrix in which the forward log probabilities of each state at each time interval will be stored. 
backwardLogProb  Matrix in which the backward log probabilities of each state at each time interval will be stored. 
logScales  Vector in which the log of scaling factors at each time interval will be stored. 
double LogLikelihood  (  const arma::mat &  dataSeq  )  const 
Compute the log-likelihood of the given data sequence.
dataSeq  Data sequence to evaluate the likelihood of. 
double LogLikelihood  (  const arma::vec &  data, 
double &  logLikelihood,  
arma::vec &  forwardLogProb  
)  const 
Compute the log-likelihood of the given data up to time t, storing the result in logLikelihood.
This is meant for incremental or streaming computation of the log-likelihood of a sequence. For the first data point, provide an empty forwardLogProb vector.
data  Observation at time t. 
logLikelihood  Log-likelihood of the given sequence of data up to time t-1. 
forwardLogProb  Vector in which forward probabilities will be saved. Passing forwardLogProb as an empty vector indicates the start of the sequence (i.e. time t=0). 
double LogScaleFactor  (  const arma::vec &  data, 
arma::vec &  forwardLogProb  
)  const 
Compute the log of the scaling factor of the given data at time t.
To calculate the log-likelihood for the whole sequence, accumulate the log scale factor (the return value of this function) over the entire sequence. This is meant for incremental or streaming computation of the log-likelihood of a sequence. For the first data point, provide an empty forwardLogProb vector.
data  Observation at time t. 
forwardLogProb  Vector in which forward probabilities will be saved. Passing forwardLogProb as an empty vector indicates the start of the sequence (i.e. time t=0). 
double Predict  (  const arma::mat &  dataSeq, 
arma::Row< size_t > &  stateSeq  
)  const 
Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.
dataSeq  Sequence of observations. 
stateSeq  Vector in which the most probable state sequence will be stored. 
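For readers unfamiliar with the Viterbi algorithm named here, the following standalone sketch shows the log-space recursion with backpointers. It is a generic textbook version under stated assumptions, not mlpack's implementation: `Viterbi` and `Matrix` are hypothetical names, and emission probabilities are assumed to be precomputed as `emissionProb[i][t] = P(observation t | state i)`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// transition[i][j] = P(state i | previous state j).
// Stores the most probable state path in stateSeq and returns its
// log-probability (joint with the observations).
double Viterbi(const std::vector<double>& initial,
               const Matrix& transition,
               const Matrix& emissionProb,
               std::vector<std::size_t>& stateSeq)
{
  const std::size_t n = initial.size();
  assert(n > 0 && transition.size() == n && emissionProb.size() == n);
  const std::size_t len = emissionProb[0].size();
  assert(len > 0);

  Matrix score(n, std::vector<double>(len, 0.0));
  std::vector<std::vector<std::size_t>> back(
      n, std::vector<std::size_t>(len, 0));

  for (std::size_t i = 0; i < n; ++i)
    score[i][0] = std::log(initial[i]) + std::log(emissionProb[i][0]);

  // Recursion: best predecessor for each state at each time step.
  for (std::size_t t = 1; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      double best = -std::numeric_limits<double>::infinity();
      std::size_t arg = 0;
      for (std::size_t j = 0; j < n; ++j)
      {
        const double cand = score[j][t - 1] + std::log(transition[i][j]);
        if (cand > best) { best = cand; arg = j; }
      }
      score[i][t] = best + std::log(emissionProb[i][t]);
      back[i][t] = arg;
    }
  }

  // Termination: pick the best final state, then follow backpointers.
  std::size_t last = 0;
  for (std::size_t i = 1; i < n; ++i)
    if (score[i][len - 1] > score[last][len - 1]) last = i;
  stateSeq.assign(len, 0);
  stateSeq[len - 1] = last;
  for (std::size_t t = len - 1; t > 0; --t)
    stateSeq[t - 1] = back[stateSeq[t]][t];
  return score[last][len - 1];
}
```

Working in log space turns the products of probabilities into sums, which avoids underflow on long sequences.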
void save  (  Archive &  ar, 
const uint32_t  version  
)  const 
Save the object.
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Tolerance().
void Smooth  (  const arma::mat &  dataSeq, 
arma::mat &  smoothSeq  
)  const 
HMM smoothing.
Computes the expected emission at each time conditioned on all observations. That is, E{ Y[t] | Y[0], ..., Y[T] }. The returned matrix has columns equal to the number of observations. Note that the expectation may not be meaningful for discrete emissions.
dataSeq  Sequence of observations. 
smoothSeq  Vector in which the expected emission sequence will be stored. 

double Train  (  const std::vector< arma::mat > &  dataSeq  ) 
Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.
Provide the initial guesses for the transition matrix and emission distributions in the constructor rather than here. Each matrix in the vector of data sequences holds an individual data sequence; each point in each individual data sequence should be a column in the matrix. The number of rows in each matrix should be equal to the dimensionality of the HMM (which is set in the constructor).
It is preferable to use the other overload of Train(), with labeled data. That will produce much better results. However, if labeled data is unavailable, this will work. In addition, it is possible to use Train() with labeled data first, and then continue to train the model using this overload of Train() with unlabeled data.
The tolerance of the Baum-Welch algorithm can be set either in the constructor or with the Tolerance() method. When the change in log-likelihood of the model between iterations is less than the tolerance, the Baum-Welch algorithm terminates.
dataSeq  Vector of observation sequences. 
void Train  (  const std::vector< arma::mat > &  dataSeq, 
const std::vector< arma::Row< size_t > > &  stateSeq  
) 
Train the model using the given labeled observations; the transition and emission matrices are directly estimated.
Each matrix in the vector of data sequences corresponds to a vector in the vector of state sequences. Each point in each individual data sequence should be a column in the matrix, and its state should be the corresponding element in the state sequence vector. For instance, dataSeq[0].col(3) corresponds to the fourth observation in the first data sequence, and its state is stateSeq[0][3]. The number of rows in each matrix should be equal to the dimensionality of the HMM (which is set in the constructor).
dataSeq  Vector of observation sequences. 
stateSeq  Vector of state sequences, corresponding to each observation. 
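"Directly estimated" means counting: with labeled state sequences, the transition matrix can be obtained from transition counts normalized per source state. The sketch below shows only that counting step for the transition matrix. It is not mlpack's code: `EstimateTransition` is a hypothetical name, and giving a uniform column to states that never appear as a source is an assumption made here so the result stays column-stochastic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns T with T[i][j] = estimated P(state i | previous state j),
// so each column sums to 1.
Matrix EstimateTransition(
    const std::vector<std::vector<std::size_t>>& stateSeqs,
    std::size_t numStates)
{
  Matrix t(numStates, std::vector<double>(numStates, 0.0));

  // Count transitions j -> i within each labeled sequence.
  for (const auto& seq : stateSeqs)
  {
    for (std::size_t k = 1; k < seq.size(); ++k)
    {
      assert(seq[k] < numStates && seq[k - 1] < numStates);
      t[seq[k]][seq[k - 1]] += 1.0;
    }
  }

  // Normalize each column. Unseen source states get a uniform column
  // (an assumption of this sketch).
  for (std::size_t j = 0; j < numStates; ++j)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < numStates; ++i) sum += t[i][j];
    for (std::size_t i = 0; i < numStates; ++i)
      t[i][j] = (sum > 0.0) ? t[i][j] / sum : 1.0 / numStates;
  }
  return t;
}
```

Transitions are counted only within a sequence, never across the boundary between two sequences. The emission distributions would be fitted in the same spirit, by grouping observations according to their labeled state.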


protected 
Set of emission probability distributions; one for each state.
Definition at line 497 of file hmm.hpp.
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Emission().

protected 
A proxy variable in linear space for logTransition.
Should be removed in mlpack 4.0.
Definition at line 503 of file hmm.hpp.
Referenced by HMM< mlpack::distribution::DiscreteDistribution >::Transition().