The Gini gain, a measure of set purity usable as a fitness function (FitnessFunction) for decision trees.

Static Public Member Functions

template<bool UseWeights, typename RowType , typename WeightVecType >
static double Evaluate (const RowType &labels, const size_t numClasses, const WeightVecType &weights)
 Evaluate the Gini impurity on the given set of labels.

 
template<bool UseWeights, typename CountType >
static double EvaluatePtr (const CountType *counts, const size_t countLength, const CountType totalCount)
 Evaluate the Gini impurity given a vector of class weight counts.

 
static double Range (const size_t numClasses)
 Return the range of the Gini impurity for the given number of classes.

 

Detailed Description

The Gini gain, a measure of set purity usable as a fitness function (FitnessFunction) for decision trees.

This is exactly the well-known Gini impurity, but negated: the decision tree tries to maximize gain, whereas the Gini impurity would need to be minimized.
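
For reference, the relationship can be written out using the standard definition of Gini impurity. The notation is an assumption introduced here and does not come from gini_gain.hpp: $S$ is the set of labels, $K$ is the number of classes, and $p_i$ is the (possibly weighted) fraction of $S$ that belongs to class $i$.

```latex
\[
\mathrm{GiniImpurity}(S) = \sum_{i=1}^{K} p_i \,(1 - p_i) = 1 - \sum_{i=1}^{K} p_i^2,
\qquad
\mathrm{GiniGain}(S) = -\,\mathrm{GiniImpurity}(S).
\]
```

Under this definition a pure set (some $p_i = 1$) has gain 0, the largest possible value, and a set split evenly over $K$ classes has gain $-(1 - 1/K)$, the smallest.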

Definition at line 27 of file gini_gain.hpp.

Member Function Documentation

◆ Evaluate()

static double Evaluate ( const RowType &  labels,
const size_t  numClasses,
const WeightVecType &  weights 
)
inline static

Evaluate the Gini impurity on the given set of labels.

RowType should be an Armadillo vector that holds size_t objects.

Note that, due to floating-point representation issues, the gain returned can be very slightly greater than 0. Thus, if you are checking for a perfect fit, be sure to use 'gain >= 0.0', not 'gain == 0.0'.

Parameters
labels: Set of labels to evaluate Gini impurity on.
numClasses: Number of classes in the dataset.
weights: Weight of labels.

Definition at line 62 of file gini_gain.hpp.
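
The following is a minimal, self-contained sketch of a weighted evaluation of this kind, together with the perfect-fit check recommended above. It is not mlpack's implementation: `WeightedGiniGainSketch` and `IsPerfectFit` are hypothetical names, `std::vector` stands in for the Armadillo types, and the treatment of an empty set as gain 0 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a weighted Gini gain evaluation; not mlpack code.
// labels[i] must lie in [0, numClasses); weights[i] is the weight of labels[i].
double WeightedGiniGainSketch(const std::vector<std::size_t>& labels,
                              const std::size_t numClasses,
                              const std::vector<double>& weights)
{
  assert(labels.size() == weights.size());

  // Accumulate the total weight seen for each class.
  std::vector<double> classWeights(numClasses, 0.0);
  double totalWeight = 0.0;
  for (std::size_t i = 0; i < labels.size(); ++i)
  {
    assert(labels[i] < numClasses);
    classWeights[labels[i]] += weights[i];
    totalWeight += weights[i];
  }

  // Assumption: an empty (or zero-weight) set is treated as perfectly pure.
  if (totalWeight == 0.0)
    return 0.0;

  // Negated impurity: -sum_i f_i * (1 - f_i), with f_i the weight fraction.
  double impurity = 0.0;
  for (const double w : classWeights)
  {
    const double f = w / totalWeight;
    impurity += f * (1.0 - f);
  }
  return -impurity;
}

// Perfect-fit check that compares with >= rather than ==.
bool IsPerfectFit(const double gain)
{
  return gain >= 0.0;
}
```

With labels that all belong to one class, `IsPerfectFit` returns true; with two equally weighted labels from different classes, the gain is about -0.5 and the check returns false.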

◆ EvaluatePtr()

static double EvaluatePtr ( const CountType *  counts,
const size_t  countLength,
const CountType  totalCount 
)
inline static

Evaluate the Gini impurity given a vector of class weight counts.

Definition at line 34 of file gini_gain.hpp.
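
As an illustration of computing the value from per-class counts, here is a small sketch that mirrors the parameter list above. It is not taken from gini_gain.hpp: `GiniGainFromCounts` is a hypothetical name, it omits the `UseWeights` template parameter, and returning 0 for an empty set is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not mlpack code: negated Gini impurity from raw counts.
// counts points to countLength per-class totals that sum to totalCount.
template<typename CountType>
double GiniGainFromCounts(const CountType* counts,
                          const std::size_t countLength,
                          const CountType totalCount)
{
  assert(counts != nullptr || countLength == 0);

  // Assumption: an empty set is treated as perfectly pure.
  if (totalCount == 0)
    return 0.0;

  double impurity = 0.0;
  for (std::size_t i = 0; i < countLength; ++i)
  {
    // Fraction of the set that falls in class i.
    const double f = double(counts[i]) / double(totalCount);
    impurity += f * (1.0 - f);
  }
  return -impurity;
}
```

Passing counts of {4, 0} with a total of 4 describes a pure set and yields a gain of 0; counts of {2, 2} describe an even two-class split and yield -0.5.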

◆ Range()

static double Range ( const size_t  numClasses)
inline static

Return the range of the Gini impurity for the given number of classes.

(That is, the difference between the maximum possible value and the minimum possible value.)

Parameters
numClasses: Number of classes in the dataset.

Definition at line 203 of file gini_gain.hpp.
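
To make the idea of a range concrete: for the standard Gini impurity, the minimum is 0 (a pure set) and the maximum is 1 - 1/numClasses (classes evenly distributed), so the difference is 1 - 1/numClasses. The sketch below computes that bound. `GiniRangeSketch` is a hypothetical name, and the assumption that Range() returns exactly this quantity is not confirmed by the text above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not mlpack code: width of the interval of possible
// Gini impurity values for a dataset with numClasses classes.
double GiniRangeSketch(const std::size_t numClasses)
{
  assert(numClasses > 0);

  // Minimum impurity is 0 (one class only); maximum is 1 - 1/numClasses
  // (uniform distribution), so the range equals the maximum.
  return 1.0 - 1.0 / double(numClasses);
}
```

For two classes this gives 0.5, and for four classes 0.75; the value approaches 1 as the number of classes grows.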


The documentation for this class was generated from the following file:
  • /home/jenkins-mlpack/mlpack.org/_src/mlpack-git/src/mlpack/methods/decision_tree/gini_gain.hpp