mlpack
fast, flexible C++ machine learning library
DecisionStump< MatType > Class Template Reference

This class implements a decision stump.

Public Member Functions

 DecisionStump (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const size_t bucketSize=10)
 Constructor.

 DecisionStump (const DecisionStump<> &other, const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const arma::rowvec &weights)
 Alternate constructor that copies the bucketSize and classes parameters from an already initialized decision stump, other.

 DecisionStump ()
 Create a decision stump without training.

const arma::Col< size_t > BinLabels () const
 Access the labels for each split bin.

arma::Col< size_t > & BinLabels ()
 Modify the labels for each split bin (be careful!).

void Classify (const MatType &test, arma::Row< size_t > &predictedLabels)
 Classification function.

template<typename Archive>
void serialize (Archive &ar, const unsigned int)
 Serialize the decision stump.

const arma::vec & Split () const
 Access the splitting values.

arma::vec & Split ()
 Modify the splitting values (be careful!).

size_t SplitDimension () const
 Access the splitting dimension.

size_t & SplitDimension ()
 Modify the splitting dimension (be careful!).

double Train (const MatType &data, const arma::Row< size_t > &labels, const size_t numClasses, const size_t bucketSize)
 Train the decision stump on the given data.

double Train (const MatType &data, const arma::Row< size_t > &labels, const arma::rowvec &weights, const size_t numClasses, const size_t bucketSize)
 Train the decision stump on the given data, with the given weights.

Detailed Description


template<typename MatType = arma::mat>

class mlpack::decision_stump::DecisionStump< MatType >

This class implements a decision stump.

It constructs a single-level decision tree, i.e., a decision stump, and uses entropy to choose the splitting ranges.

The stump is parameterized by a splitting dimension (the dimension on which points are split), a vector of bin split values, and a vector of labels for each bin. Bin i is specified by the range [split[i], split[i + 1]). The last bin extends up to infinity (split[i + 1] does not exist in that case). Points that fall below the first bin take the label of the first bin.
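The bin ranges described above can be illustrated with a minimal sketch. `BinIndex` is a hypothetical helper written for this page, not part of mlpack, and it assumes a non-empty split vector sorted in ascending order (held here in a plain `std::vector` rather than an `arma::vec`).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an mlpack function): return the index of the bin
// that `value` falls into, where bin i covers [split[i], split[i + 1]) and
// the last bin is unbounded above.
// Assumes `split` is non-empty and sorted in ascending order.
std::size_t BinIndex(const std::vector<double>& split, const double value)
{
  // upper_bound finds the first split strictly greater than value, so the
  // element just before it is the lower edge of the containing bin.
  const auto it = std::upper_bound(split.begin(), split.end(), value);

  // Values below the first split take the first bin.
  if (it == split.begin())
    return 0;

  return static_cast<std::size_t>(it - split.begin()) - 1;
}
```

With split values {0.0, 1.5, 3.0}, a value of 1.5 lands in bin 1 because the lower edge of each bin is inclusive, and any value at or above 3.0 lands in the last bin.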

Template Parameters
MatType: Type of matrix that is being used (sparse or dense).

Definition at line 34 of file decision_stump.hpp.

Constructor & Destructor Documentation

◆ DecisionStump() [1/3]

DecisionStump ( const MatType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const size_t  bucketSize = 10 
)

Constructor.

Trains a decision stump on the provided data.

Parameters
data: Input training data.
labels: Labels of training data.
numClasses: Number of distinct classes in labels.
bucketSize: Minimum size of bucket when splitting.

◆ DecisionStump() [2/3]

DecisionStump ( const DecisionStump<> &  other,
const MatType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const arma::rowvec &  weights 
)

Alternate constructor that copies the bucketSize and classes parameters from an already initialized decision stump, other.

It appropriately sets the weight vector.

Parameters
other: The already initialized DecisionStump object from which the values are copied.
data: The data on which to train this object.
labels: The labels of data.
numClasses: Number of distinct classes in labels.
weights: Weight vector to use while training. For boosting purposes.

◆ DecisionStump() [3/3]

DecisionStump ( )

Create a decision stump without training.

A stump created this way is not useful on its own: it returns class 0 for every point it classifies. Call Train() after using this constructor and before classifying.

Member Function Documentation

◆ BinLabels() [1/2]

const arma::Col<size_t> BinLabels ( ) const
inline

Access the labels for each split bin.

Definition at line 130 of file decision_stump.hpp.

◆ BinLabels() [2/2]

arma::Col<size_t>& BinLabels ( )
inline

Modify the labels for each split bin (be careful!).

Definition at line 132 of file decision_stump.hpp.


◆ Classify()

void Classify ( const MatType &  test,
arma::Row< size_t > &  predictedLabels 
)

Classification function.

After training, classify test, and put the predicted classes in predictedLabels.

Parameters
test: Testing data or data to classify.
predictedLabels: Vector to store the predicted classes after classifying test data.
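As a rough illustration of what classification involves, the sketch below applies a stump to every point of a dataset. It assumes the usual mlpack and Armadillo layout, in which the matrix is column-major and each column is one point. `StumpSketch` and `ClassifySketch` are hypothetical names used only here; this is not mlpack's implementation.

```cpp
#include <cstddef>
#include <vector>

// Illustrative stand-in for a trained stump; not mlpack's class.
struct StumpSketch
{
  std::size_t splitDimension;          // Row (feature) used for splitting.
  std::vector<double> split;           // Ascending lower edges of the bins.
  std::vector<std::size_t> binLabels;  // One label per bin.
};

// Classify every point of a column-major matrix with `rows` features per
// point; point j occupies data[j * rows] through data[j * rows + rows - 1].
std::vector<std::size_t> ClassifySketch(const StumpSketch& stump,
                                        const std::vector<double>& data,
                                        const std::size_t rows)
{
  const std::size_t points = data.size() / rows;
  std::vector<std::size_t> predicted(points);

  for (std::size_t j = 0; j < points; ++j)
  {
    // Only the splitting dimension of each point is inspected.
    const double value = data[j * rows + stump.splitDimension];

    // Advance while the next bin's lower edge is still <= value.
    std::size_t bin = 0;
    while (bin + 1 < stump.split.size() && value >= stump.split[bin + 1])
      ++bin;

    predicted[j] = stump.binLabels[bin];
  }

  return predicted;
}
```

Only one feature of each point influences the prediction, which is why a stump is cheap to evaluate and is commonly used as a weak learner.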

◆ serialize()

void serialize ( Archive &  ar,
const unsigned int
)

Serialize the decision stump.


◆ Split() [1/2]

const arma::vec& Split ( ) const
inline

Access the splitting values.

Definition at line 125 of file decision_stump.hpp.

◆ Split() [2/2]

arma::vec& Split ( )
inline

Modify the splitting values (be careful!).

Definition at line 127 of file decision_stump.hpp.

◆ SplitDimension() [1/2]

size_t SplitDimension ( ) const
inline

Access the splitting dimension.

Definition at line 120 of file decision_stump.hpp.

◆ SplitDimension() [2/2]

size_t& SplitDimension ( )
inline

Modify the splitting dimension (be careful!).

Definition at line 122 of file decision_stump.hpp.

◆ Train() [1/2]

double Train ( const MatType &  data,
const arma::Row< size_t > &  labels,
const size_t  numClasses,
const size_t  bucketSize 
)

Train the decision stump on the given data.

This discards any previously trained model, so the stump after training may differ completely from the one before.

Parameters
data: Dataset to train on.
labels: Labels for each point in the dataset.
numClasses: Number of classes in the dataset.
bucketSize: Minimum size of bucket when splitting.
Returns
The final entropy after splitting.
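
One common way to define "entropy after splitting" is the entropy of each bin weighted by the fraction of points that fall into it. The sketch below computes that quantity from per-bin class counts. This is an assumption made for illustration: the page does not state the exact formula, logarithm base, or sign convention that mlpack uses, and `Entropy` and `SplitEntropy` are hypothetical helper names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy, in bits, of a vector of per-class counts.
double Entropy(const std::vector<std::size_t>& counts)
{
  std::size_t total = 0;
  for (const std::size_t c : counts)
    total += c;
  if (total == 0)
    return 0.0;

  double h = 0.0;
  for (const std::size_t c : counts)
  {
    if (c == 0)
      continue;  // A class with no points contributes nothing.
    const double p = static_cast<double>(c) / total;
    h -= p * std::log2(p);
  }
  return h;
}

// Entropy after a split: each bin's entropy weighted by the fraction of all
// points that fall into that bin. binCounts[b][k] is the number of points of
// class k in bin b.
double SplitEntropy(const std::vector<std::vector<std::size_t>>& binCounts)
{
  std::size_t total = 0;
  for (const auto& bin : binCounts)
    for (const std::size_t c : bin)
      total += c;
  if (total == 0)
    return 0.0;

  double h = 0.0;
  for (const auto& bin : binCounts)
  {
    std::size_t binTotal = 0;
    for (const std::size_t c : bin)
      binTotal += c;
    h += static_cast<double>(binTotal) / total * Entropy(bin);
  }
  return h;
}
```

Under this definition a split that separates the classes perfectly yields 0, and larger values indicate more mixed bins.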


◆ Train() [2/2]

double Train ( const MatType &  data,
const arma::Row< size_t > &  labels,
const arma::rowvec &  weights,
const size_t  numClasses,
const size_t  bucketSize 
)

Train the decision stump on the given data, with the given weights.

This discards any previously trained model, so the stump after training may differ completely from the one before.

Parameters
data: Dataset to train on.
labels: Labels for each point in the dataset.
weights: Weights for each point in the dataset.
numClasses: Number of classes in the dataset.
bucketSize: Minimum size of bucket when splitting.
Returns
The final entropy after splitting.
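
To show how per-point weights can enter an entropy calculation, the sketch below lets each point contribute its weight to its class instead of a count of 1, as a boosting algorithm might require. This is an assumption about how the weights are used, not a description of mlpack's internals. `WeightedEntropy` is a hypothetical helper, and the labels are assumed to lie in [0, numClasses).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Entropy, in bits, of the label distribution when point i contributes
// weights[i] to its class rather than a count of 1.
// Assumes labels.size() == weights.size() and every label < numClasses.
double WeightedEntropy(const std::vector<std::size_t>& labels,
                       const std::vector<double>& weights,
                       const std::size_t numClasses)
{
  std::vector<double> classWeight(numClasses, 0.0);
  double total = 0.0;
  for (std::size_t i = 0; i < labels.size(); ++i)
  {
    classWeight[labels[i]] += weights[i];
    total += weights[i];
  }
  if (total <= 0.0)
    return 0.0;

  double h = 0.0;
  for (const double w : classWeight)
  {
    if (w <= 0.0)
      continue;  // Classes with no weight contribute nothing.
    const double p = w / total;
    h -= p * std::log2(p);
  }
  return h;
}
```

With labels {0, 0, 0, 1}, uniform weights give an entropy below 1 bit, while weights {1, 1, 1, 3} balance the two classes and give 1 bit, so up-weighting a minority point changes which splits look attractive.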

The documentation for this class was generated from the following file:
  • /home/ryan/src/mlpack.org/_src/mlpack-git/src/mlpack/methods/decision_stump/decision_stump.hpp