mlpack  master
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > Class Template Reference

A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces. More...

Classes

class  DualTreeTraverser
 A dual-tree cover tree traverser; see dual_tree_traverser.hpp. More...

 
class  SingleTreeTraverser
 A single-tree cover tree traverser; see single_tree_traverser.hpp for implementation. More...

 

Public Types

template<typename RuleType>
using BreadthFirstDualTreeTraverser = DualTreeTraverser< RuleType >
 
typedef MatType::elem_type ElemType
 The type held by the matrix type. More...

 
typedef MatType Mat
 So that other classes can access the matrix type. More...

 

Public Member Functions

 CoverTree (const MatType &dataset, const ElemType base=2.0, MetricType *metric=NULL)
 Create the cover tree with the given dataset and given base. More...

 
 CoverTree (const MatType &dataset, MetricType &metric, const ElemType base=2.0)
 Create the cover tree with the given dataset and the given instantiated metric. More...

 
 CoverTree (MatType &&dataset, const ElemType base=2.0)
 Create the cover tree with the given dataset, taking ownership of the dataset. More...

 
 CoverTree (MatType &&dataset, MetricType &metric, const ElemType base=2.0)
 Create the cover tree with the given dataset and the given instantiated metric, taking ownership of the dataset. More...

 
 CoverTree (const MatType &dataset, const ElemType base, const size_t pointIndex, const int scale, CoverTree *parent, const ElemType parentDistance, arma::Col< size_t > &indices, arma::vec &distances, size_t nearSetSize, size_t &farSetSize, size_t &usedSetSize, MetricType &metric=NULL)
 Construct a child cover tree node. More...

 
 CoverTree (const MatType &dataset, const ElemType base, const size_t pointIndex, const int scale, CoverTree *parent, const ElemType parentDistance, const ElemType furthestDescendantDistance, MetricType *metric=NULL)
 Manually construct a cover tree node; no tree assembly is done in this constructor, and children must be added manually (use Children()). More...

 
 CoverTree (const CoverTree &other)
 Create a cover tree from another tree. More...

 
 CoverTree (CoverTree &&other)
 Move constructor for a cover tree; it takes possession of all the members of the given tree. More...

 
template<typename Archive>
 CoverTree (Archive &ar, const typename std::enable_if_t< Archive::is_loading::value > *=0)
 Create a cover tree from a boost::serialization archive. More...

 
 ~CoverTree ()
 Delete this cover tree node and its children. More...

 
ElemType Base () const
 Get the base. More...

 
ElemType & Base ()
 Modify the base; don't do this, you'll break everything. More...

 
void Center (arma::vec &center) const
 Get the center of the node and store it in the given vector. More...

 
const CoverTree & Child (const size_t index) const
 Get a particular child node. More...

 
CoverTree & Child (const size_t index)
 Modify a particular child node. More...

 
CoverTree *& ChildPtr (const size_t index)
 
const std::vector< CoverTree * > & Children () const
 Get the children. More...

 
std::vector< CoverTree * > & Children ()
 Modify the children manually (maybe not a great idea). More...

 
const MatType & Dataset () const
 Get a reference to the dataset. More...

 
size_t Descendant (const size_t index) const
 Get the index of a particular descendant point. More...

 
size_t DistanceComps () const
 
size_t & DistanceComps ()
 
ElemType FurthestDescendantDistance () const
 Get the distance from the center of the node to the furthest descendant. More...

 
ElemType & FurthestDescendantDistance ()
 Modify the distance from the center of the node to the furthest descendant. More...

 
ElemType FurthestPointDistance () const
 Get the distance to the furthest point. This is always 0 for cover trees. More...

 
template<typename VecType>
size_t GetFurthestChild (const VecType &point, typename std::enable_if_t< IsVector< VecType >::value > *=0)
 Return the index of the furthest child node to the given query point. More...

 
size_t GetFurthestChild (const CoverTree &queryNode)
 Return the index of the furthest child node to the given query node. More...

 
template<typename VecType>
size_t GetNearestChild (const VecType &point, typename std::enable_if_t< IsVector< VecType >::value > *=0)
 Return the index of the nearest child node to the given query point. More...

 
size_t GetNearestChild (const CoverTree &queryNode)
 Return the index of the nearest child node to the given query node. More...

 
bool IsLeaf () const
 
ElemType MaxDistance (const CoverTree &other) const
 Return the maximum distance to another node. More...

 
ElemType MaxDistance (const CoverTree &other, const ElemType distance) const
 Return the maximum distance to another node given that the point-to-point distance has already been calculated. More...

 
ElemType MaxDistance (const arma::vec &other) const
 Return the maximum distance to another point. More...

 
ElemType MaxDistance (const arma::vec &other, const ElemType distance) const
 Return the maximum distance to another point given that the distance from the center to the point has already been calculated. More...

 
MetricType & Metric () const
 Get the instantiated metric. More...

 
ElemType MinDistance (const CoverTree &other) const
 Return the minimum distance to another node. More...

 
ElemType MinDistance (const CoverTree &other, const ElemType distance) const
 Return the minimum distance to another node given that the point-to-point distance has already been calculated. More...

 
ElemType MinDistance (const arma::vec &other) const
 Return the minimum distance to another point. More...

 
ElemType MinDistance (const arma::vec &other, const ElemType distance) const
 Return the minimum distance to another point given that the distance from the center to the point has already been calculated. More...

 
ElemType MinimumBoundDistance () const
 Get the minimum distance from the center to any bound edge (this is the same as furthestDescendantDistance). More...

 
size_t NumChildren () const
 Get the number of children. More...

 
size_t NumDescendants () const
 Get the number of descendant points. More...

 
size_t NumPoints () const
 
CoverTree * Parent () const
 Get the parent node. More...

 
CoverTree *& Parent ()
 Modify the parent node. More...

 
ElemType ParentDistance () const
 Get the distance to the parent. More...

 
ElemType & ParentDistance ()
 Modify the distance to the parent. More...

 
size_t Point () const
 Get the index of the point which this node represents. More...

 
size_t Point (const size_t) const
 For compatibility with other trees; the argument is ignored. More...

 
math::RangeType< ElemType > RangeDistance (const CoverTree &other) const
 Return the minimum and maximum distance to another node. More...

 
math::RangeType< ElemType > RangeDistance (const CoverTree &other, const ElemType distance) const
 Return the minimum and maximum distance to another node given that the point-to-point distance has already been calculated. More...

 
math::RangeType< ElemType > RangeDistance (const arma::vec &other) const
 Return the minimum and maximum distance to another point. More...

 
math::RangeType< ElemType > RangeDistance (const arma::vec &other, const ElemType distance) const
 Return the minimum and maximum distance to another point given that the point-to-point distance has already been calculated. More...

 
int Scale () const
 Get the scale of this node. More...

 
int & Scale ()
 Modify the scale of this node. Be careful... More...

 
template<typename Archive>
void Serialize (Archive &ar, const unsigned int)
 Serialize the tree. More...

 
const StatisticType & Stat () const
 Get the statistic for this node. More...

 
StatisticType & Stat ()
 Modify the statistic for this node. More...

 

Protected Member Functions

 CoverTree ()
 A default constructor. More...

 

Detailed Description


template<typename MetricType = metric::LMetric<2, true>, typename StatisticType = EmptyStatistic, typename MatType = arma::mat, typename RootPointPolicy = FirstPointIsRoot>
class mlpack::tree::CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >

A cover tree is a tree specifically designed to speed up nearest-neighbor computation in high-dimensional spaces.

Each non-leaf node references a point and has a nonzero number of children, including a "self-child" which references the same point. A leaf node represents only one point.

The tree can be thought of as a hierarchy with the root node at the top level and the leaf nodes at the bottom level. Each level in the tree has an assigned 'scale' i. The tree follows these two invariants:

  • nesting: the level C_i is a subset of the level C_{i - 1}.
  • covering: every node in level C_{i - 1} has at least one node in level C_i at distance less than or equal to b^i (exactly one of these is the parent of the point in level C_{i - 1}).

Note that in the cover tree paper, there is a third invariant (the 'separation invariant'), but that does not apply to our implementation, because we have relaxed the invariant.

The value 'b' refers to the base, which is a parameter of the tree. These properties make the cover tree very good for fast, high-dimensional nearest-neighbor search.
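The nesting and covering invariants can be checked mechanically on a toy example. The sketch below is not mlpack code: it assumes one-dimensional points, the absolute difference as the metric, and a hypothetical `Levels` map from each scale i to the indices of the points in level C_i.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical representation: scale i -> indices of the points in level C_i.
using Levels = std::map<int, std::set<std::size_t>>;

// Nesting: C_i must be a subset of C_{i - 1}.
bool NestingHolds(const Levels& levels, const int i)
{
  const auto upper = levels.find(i);
  const auto lower = levels.find(i - 1);
  if (upper == levels.end() || lower == levels.end())
    return false;
  for (const std::size_t p : upper->second)
    if (lower->second.count(p) == 0)
      return false;
  return true;
}

// Covering: every point of C_{i - 1} lies within base^i of some point of C_i.
bool CoveringHolds(const std::vector<double>& points, const Levels& levels,
                   const int i, const double base)
{
  assert(base > 0.0);
  const auto upper = levels.find(i);
  const auto lower = levels.find(i - 1);
  if (upper == levels.end() || lower == levels.end())
    return false;
  const double radius = std::pow(base, i);
  for (const std::size_t q : lower->second)
  {
    bool covered = false;
    for (const std::size_t p : upper->second)
      if (std::abs(points[p] - points[q]) <= radius)
        covered = true;
    if (!covered)
      return false;
  }
  return true;
}
```

With base 2 and points {0, 1, 3, 10}, the levels C_2 = {0, 10} and C_1 = {0, 1, 3, 10} satisfy both checks at i = 2, since every point is within 2^2 = 4 of either 0 or 10.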

The theoretical structure of the tree contains many 'implicit' nodes which only have a "self-child" (a child referencing the same point, but at a lower scale level). This practical implementation only constructs explicit nodes – non-leaf nodes with more than one child. A leaf node has no children, and its scale level is INT_MIN.

For more information on cover trees, see

@inproceedings{
author = {Beygelzimer, Alina and Kakade, Sham and Langford, John},
title = {Cover trees for nearest neighbor},
booktitle = {Proceedings of the 23rd International Conference on Machine
Learning},
series = {ICML '06},
year = {2006},
pages = {97--104}
}

For information on runtime bounds of the nearest-neighbor computation using cover trees, see the following paper, presented at NIPS 2009:

@inproceedings{
author = {Ram, P. and Lee, D. and March, W.B. and Gray, A.G.},
title = {Linear-time Algorithms for Pairwise Statistical Problems},
booktitle = {Advances in Neural Information Processing Systems 22},
editor = {Y. Bengio and D. Schuurmans and J. Lafferty and C.K.I. Williams
and A. Culotta},
pages = {1527--1535},
year = {2009}
}

The CoverTree class offers four template parameters. A custom metric type can be used with MetricType (this class defaults to the L2 metric, metric::LMetric<2, true>). The root node's point can be chosen with the RootPointPolicy; by default, the FirstPointIsRoot policy is used, meaning the first point in the dataset is used. The StatisticType policy allows you to define statistics which can be gathered during the creation of the tree, and MatType sets the type of matrix the tree is built on.
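To illustrate how policy template parameters of this kind plug together, here is a minimal standard-library sketch. It is not the mlpack implementation: `AbsDifference`, `FirstIsRoot`, `LastIsRoot`, and `ToyRoot` are hypothetical names, and the `ChooseRoot` and `Evaluate` signatures are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical metric: absolute difference between two scalars.
struct AbsDifference
{
  double Evaluate(const double a, const double b) const
  {
    return std::abs(a - b);
  }
};

// Hypothetical root policies: each picks the index of the root point.
struct FirstIsRoot
{
  static std::size_t ChooseRoot(const std::vector<double>&) { return 0; }
};

struct LastIsRoot
{
  static std::size_t ChooseRoot(const std::vector<double>& data)
  {
    return data.size() - 1;
  }
};

// Toy stand-in for a tree root: the policies decide which point is the root
// and how distances from it are measured.
template<typename MetricType = AbsDifference,
         typename RootPointPolicy = FirstIsRoot>
class ToyRoot
{
 public:
  explicit ToyRoot(const std::vector<double>& data) :
      data(data), root(RootPointPolicy::ChooseRoot(data))
  {
    assert(!data.empty());
  }

  std::size_t Point() const { return root; }

  // Largest distance from the root point to any point in the dataset.
  double FurthestDescendantDistance() const
  {
    double furthest = 0.0;
    for (const double x : data)
      furthest = std::max(furthest, metric.Evaluate(data[root], x));
    return furthest;
  }

 private:
  const std::vector<double>& data;  // Not owned; must outlive this object.
  std::size_t root;
  MetricType metric;
};
```

Swapping `FirstIsRoot` for `LastIsRoot` changes only the template argument list, which is the main appeal of this style: the choice is resolved at compile time and the class body stays untouched.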

Template Parameters
MetricType       Metric type to use during tree construction.
RootPointPolicy  Determines which point to use as the root node.
StatisticType    Statistic to be used during tree creation.
MatType          Type of matrix to build the tree on (generally mat or sp_mat).

Definition at line 99 of file cover_tree.hpp.

Member Typedef Documentation

◆ BreadthFirstDualTreeTraverser

template<typename RuleType>
using BreadthFirstDualTreeTraverser = DualTreeTraverser<RuleType>

Definition at line 264 of file cover_tree.hpp.

◆ ElemType

typedef MatType::elem_type ElemType

The type held by the matrix type.

Definition at line 105 of file cover_tree.hpp.

◆ Mat

typedef MatType Mat

So that other classes can access the matrix type.

Definition at line 103 of file cover_tree.hpp.

Constructor & Destructor Documentation

◆ CoverTree() [1/10]

CoverTree ( const MatType &  dataset,
const ElemType  base = 2.0,
MetricType *  metric = NULL 
)

Create the cover tree with the given dataset and given base.

The dataset will not be modified during the building procedure (unlike BinarySpaceTree).

The last argument will be removed in mlpack 1.1.0 (see #274 and #273).

Parameters
dataset  Reference to the dataset to build a tree on.
base     Base to use during tree building (default 2.0).

◆ CoverTree() [2/10]

CoverTree ( const MatType &  dataset,
MetricType &  metric,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset and the given instantiated metric.

Optionally, set the base. The dataset will not be modified during the building procedure (unlike BinarySpaceTree).

Parameters
dataset  Reference to the dataset to build a tree on.
metric   Instantiated metric to use during tree building.
base     Base to use during tree building (default 2.0).

◆ CoverTree() [3/10]

CoverTree ( MatType &&  dataset,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset, taking ownership of the dataset.

Optionally, set the base.

Parameters
dataset  Reference to the dataset to build a tree on.
base     Base to use during tree building (default 2.0).

◆ CoverTree() [4/10]

CoverTree ( MatType &&  dataset,
MetricType &  metric,
const ElemType  base = 2.0 
)

Create the cover tree with the given dataset and the given instantiated metric, taking ownership of the dataset.

Optionally, set the base.

Parameters
dataset  Reference to the dataset to build a tree on.
metric   Instantiated metric to use during tree building.
base     Base to use during tree building (default 2.0).

◆ CoverTree() [5/10]

CoverTree ( const MatType &  dataset,
const ElemType  base,
const size_t  pointIndex,
const int  scale,
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > *  parent,
const ElemType  parentDistance,
arma::Col< size_t > &  indices,
arma::vec &  distances,
size_t  nearSetSize,
size_t &  farSetSize,
size_t &  usedSetSize,
MetricType &  metric = NULL 
)

Construct a child cover tree node.

This constructor is not meant to be used externally, but it could be used to insert another node into a tree. This procedure uses only one vector for the near set, the far set, and the used set (this is to prevent unnecessary memory allocation in recursive calls to this constructor). Therefore, the size of the near set, far set, and used set must be passed in. The near set will be entirely used up, and some of the far set may be used. The value of usedSetSize will be set to the number of points used in the construction of this node, and the value of farSetSize will be modified to reflect the number of points in the far set after the construction of this node.

If you are calling this manually, be careful that the given scale is as small as possible, or you may be creating an implicit node in your tree.

Parameters
dataset         Reference to the dataset to build a tree on.
base            Base to use during tree building.
pointIndex      Index of the point this node references.
scale           Scale of this level in the tree.
parent          Parent of this node (NULL indicates no parent).
parentDistance  Distance to the parent node.
indices         Array of indices, ordered [ nearSet | farSet | usedSet ]; will be modified to [ farSet | usedSet ].
distances       Array of distances, ordered the same way as the indices. These represent the distances between the point specified by pointIndex and each point in the indices array.
nearSetSize     Size of the near set; if 0, this will be a leaf.
farSetSize      Size of the far set; may be modified (if this node uses any points in the far set).
usedSetSize     The number of points used will be added to this number.
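The single-array layout described above relies on reordering a prefix of two parallel arrays in place. The following sketch shows one way such a split could work; `SplitNearFar` here is a hypothetical helper written for illustration with `std::vector`, not a copy of the library's internal routine.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reorder the first setSize entries of the parallel arrays so that entries
// with distance <= bound come first. Entries at positions >= setSize (for
// example an already-used segment) are left untouched.
// Returns the number of entries with distance <= bound.
std::size_t SplitNearFar(std::vector<std::size_t>& indices,
                         std::vector<double>& distances,
                         const double bound,
                         const std::size_t setSize)
{
  assert(indices.size() == distances.size());
  assert(setSize <= indices.size());

  std::size_t nearCount = 0;
  for (std::size_t i = 0; i < setSize; ++i)
  {
    if (distances[i] <= bound)
    {
      // Swap the near entry into the growing near segment at the front.
      std::swap(indices[i], indices[nearCount]);
      std::swap(distances[i], distances[nearCount]);
      ++nearCount;
    }
  }
  return nearCount;
}
```

Because the split only swaps entries, no extra memory is allocated, which is the property the constructor documentation cites as the reason for sharing one array across recursive calls.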

◆ CoverTree() [6/10]

CoverTree ( const MatType &  dataset,
const ElemType  base,
const size_t  pointIndex,
const int  scale,
CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > *  parent,
const ElemType  parentDistance,
const ElemType  furthestDescendantDistance,
MetricType *  metric = NULL 
)

Manually construct a cover tree node; no tree assembly is done in this constructor, and children must be added manually (use Children()).

This constructor is useful when the tree is being "imported" into the CoverTree class after being created in some other manner.

Parameters
dataset                     Reference to the dataset this node is a part of.
base                        Base that was used for tree building.
pointIndex                  Index of the point in the dataset which this node refers to.
scale                       Scale of this node's level in the tree.
parent                      Parent node (NULL indicates no parent).
parentDistance              Distance to parent node point.
furthestDescendantDistance  Distance to furthest descendant point.
metric                      Instantiated metric (optional).

◆ CoverTree() [7/10]

CoverTree ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other)

Create a cover tree from another tree.

Be careful! This may use a lot of memory and take a lot of time. This will also make a copy of the dataset.

Parameters
other  Cover tree to copy from.

◆ CoverTree() [8/10]

CoverTree ( CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &&  other)

Move constructor for a cover tree; it takes possession of all the members of the given tree.

Parameters
other  Cover tree to move.

◆ CoverTree() [9/10]

CoverTree ( Archive &  ar,
const typename std::enable_if_t< Archive::is_loading::value > *  = 0 
)

Create a cover tree from a boost::serialization archive.

◆ ~CoverTree()

~CoverTree ( )

Delete this cover tree node and its children.
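The ownership model implied by the copy constructor and destructor above (a node owns its children through raw pointers) can be sketched with a toy type. `Node`, `AddChild`, and `Size` below are hypothetical names for illustration; the real class carries considerably more state, including the dataset and the metric.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node that owns its children through raw pointers.
struct Node
{
  std::size_t point;
  Node* parent;
  std::vector<Node*> children;

  explicit Node(const std::size_t point) : point(point), parent(nullptr) { }

  // Deep copy: every child subtree is duplicated and re-parented.
  Node(const Node& other) : point(other.point), parent(nullptr)
  {
    for (const Node* child : other.children)
    {
      Node* copy = new Node(*child);
      copy->parent = this;
      children.push_back(copy);
    }
  }

  Node& operator=(const Node&) = delete;

  // Deleting a node deletes the whole subtree below it.
  ~Node()
  {
    for (Node* child : children)
      delete child;
  }

  // Take ownership of a heap-allocated child.
  void AddChild(Node* child)
  {
    assert(child != nullptr);
    child->parent = this;
    children.push_back(child);
  }

  // Number of nodes in this subtree, including this node.
  std::size_t Size() const
  {
    std::size_t total = 1;
    for (const Node* child : children)
      total += child->Size();
    return total;
  }
};
```

The copy visits every node once, so its cost grows with the size of the tree, which is consistent with the warning on the copy constructor about memory and time.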

◆ CoverTree() [10/10]

CoverTree ( )
protected

A default constructor.

This is meant to only be used with boost::serialization, which is allowed with the friend declaration below. This does not return a valid tree! This method must be protected, so that the serialization shim can work with the default constructor.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Metric().

Member Function Documentation

◆ Base() [1/2]

ElemType Base ( ) const
inline

Get the base.

Definition at line 304 of file cover_tree.hpp.

◆ Base() [2/2]

ElemType& Base ( )
inline

Modify the base; don't do this, you'll break everything.

Definition at line 306 of file cover_tree.hpp.

◆ Center()

void Center ( arma::vec &  center) const
inline

Get the center of the node and store it in the given vector.

Definition at line 412 of file cover_tree.hpp.

◆ Child() [1/2]

const CoverTree& Child ( const size_t  index) const
inline

Get a particular child node.

Definition at line 278 of file cover_tree.hpp.

◆ Child() [2/2]

CoverTree& Child ( const size_t  index)
inline

Modify a particular child node.

Definition at line 280 of file cover_tree.hpp.

◆ ChildPtr()

CoverTree*& ChildPtr ( const size_t  index)
inline

Definition at line 282 of file cover_tree.hpp.

◆ Children() [1/2]

const std::vector<CoverTree*>& Children ( ) const
inline

Get the children.

Definition at line 288 of file cover_tree.hpp.

◆ Children() [2/2]

std::vector<CoverTree*>& Children ( )
inline

Modify the children manually (maybe not a great idea).

◆ Dataset()

const MatType& Dataset ( ) const
inline

Get a reference to the dataset.

Definition at line 267 of file cover_tree.hpp.

◆ Descendant()

size_t Descendant ( const size_t  index) const

Get the index of a particular descendant point.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Children().

◆ DistanceComps() [1/2]

size_t DistanceComps ( ) const
inline

Definition at line 555 of file cover_tree.hpp.

◆ DistanceComps() [2/2]

size_t& DistanceComps ( )
inline

Definition at line 556 of file cover_tree.hpp.

◆ FurthestDescendantDistance() [1/2]

ElemType FurthestDescendantDistance ( ) const
inline

Get the distance from the center of the node to the furthest descendant.

Definition at line 401 of file cover_tree.hpp.

◆ FurthestDescendantDistance() [2/2]

ElemType& FurthestDescendantDistance ( )
inline

Modify the distance from the center of the node to the furthest descendant.

Definition at line 405 of file cover_tree.hpp.

◆ FurthestPointDistance()

ElemType FurthestPointDistance ( ) const
inline

Get the distance to the furthest point. This is always 0 for cover trees.

Definition at line 398 of file cover_tree.hpp.

◆ GetFurthestChild() [1/2]

size_t GetFurthestChild ( const VecType &  point,
typename std::enable_if_t< IsVector< VecType >::value > *  = 0 
)

Return the index of the furthest child node to the given query point.

If this is a leaf node, it will return NumChildren() (invalid index).

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Stat().

◆ GetFurthestChild() [2/2]

size_t GetFurthestChild ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  queryNode)

Return the index of the furthest child node to the given query node.

If it can't decide, it will return NumChildren() (invalid index).

◆ GetNearestChild() [1/2]

size_t GetNearestChild ( const VecType &  point,
typename std::enable_if_t< IsVector< VecType >::value > *  = 0 
)

Return the index of the nearest child node to the given query point.

If this is a leaf node, it will return NumChildren() (invalid index).

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Stat().
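The "invalid index" convention used by GetNearestChild() and GetFurthestChild() can be sketched as follows. This is an illustration only: `NearestChild` is a hypothetical free function over one-dimensional child centers, and comparing center distances is an assumption rather than a statement of how the library ranks children.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Index of the child center closest to the query. Returns
// childCenters.size() (an invalid index) when there are no children,
// mirroring the documented behavior for leaf nodes.
std::size_t NearestChild(const std::vector<double>& childCenters,
                         const double query)
{
  assert(!std::isnan(query));

  std::size_t best = childCenters.size();
  double bestDistance = std::numeric_limits<double>::max();
  for (std::size_t i = 0; i < childCenters.size(); ++i)
  {
    const double distance = std::abs(childCenters[i] - query);
    if (distance < bestDistance)
    {
      bestDistance = distance;
      best = i;
    }
  }
  return best;
}
```

Callers should therefore compare the result against the number of children before using it as an index.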

◆ GetNearestChild() [2/2]

size_t GetNearestChild ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  queryNode)

Return the index of the nearest child node to the given query node.

If it can't decide, it will return NumChildren() (invalid index).

◆ IsLeaf()

bool IsLeaf ( ) const
inline

Definition at line 274 of file cover_tree.hpp.

◆ MaxDistance() [1/4]

ElemType MaxDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other) const

Return the maximum distance to another node.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Stat().

◆ MaxDistance() [2/4]

ElemType MaxDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the maximum distance to another node given that the point-to-point distance has already been calculated.

◆ MaxDistance() [3/4]

ElemType MaxDistance ( const arma::vec &  other) const

Return the maximum distance to another point.

◆ MaxDistance() [4/4]

ElemType MaxDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the maximum distance to another point given that the distance from the center to the point has already been calculated.

◆ Metric()

MetricType& Metric ( ) const
inline

Get the instantiated metric.

Definition at line 418 of file cover_tree.hpp.

References CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::CoverTree().

◆ MinDistance() [1/4]

ElemType MinDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other) const

Return the minimum distance to another node.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Stat().

◆ MinDistance() [2/4]

ElemType MinDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the minimum distance to another node given that the point-to-point distance has already been calculated.

◆ MinDistance() [3/4]

ElemType MinDistance ( const arma::vec &  other) const

Return the minimum distance to another point.

◆ MinDistance() [4/4]

ElemType MinDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the minimum distance to another point given that the distance from the center to the point has already been calculated.
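When the center-to-point distance is already known, bounds of this kind follow from the triangle inequality and the node's furthest descendant distance. The sketch below assumes a metric that satisfies the triangle inequality; `MinDistanceBound` and `MaxDistanceBound` are hypothetical helper names, not members of the class.

```cpp
#include <algorithm>
#include <cassert>

// Lower bound on the distance from a query point to any descendant of a
// node, given the precomputed distance from the node's center to the point.
// Clamped at zero because the point may lie inside the node's ball.
double MinDistanceBound(const double centerDistance,
                        const double furthestDescendantDistance)
{
  assert(centerDistance >= 0.0 && furthestDescendantDistance >= 0.0);
  return std::max(centerDistance - furthestDescendantDistance, 0.0);
}

// Matching upper bound on the distance to any descendant of the node.
double MaxDistanceBound(const double centerDistance,
                        const double furthestDescendantDistance)
{
  assert(centerDistance >= 0.0 && furthestDescendantDistance >= 0.0);
  return centerDistance + furthestDescendantDistance;
}
```

Passing the distance in lets a traversal that has already evaluated the metric for this pair avoid a second evaluation.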

◆ MinimumBoundDistance()

ElemType MinimumBoundDistance ( ) const
inline

Get the minimum distance from the center to any bound edge (this is the same as furthestDescendantDistance).

Definition at line 409 of file cover_tree.hpp.

◆ NumChildren()

size_t NumChildren ( ) const
inline

Get the number of children.

Definition at line 285 of file cover_tree.hpp.

◆ NumDescendants()

size_t NumDescendants ( ) const

Get the number of descendant points.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Children().

◆ NumPoints()

size_t NumPoints ( ) const
inline

Definition at line 275 of file cover_tree.hpp.

◆ Parent() [1/2]

CoverTree* Parent ( ) const
inline

Get the parent node.

Definition at line 388 of file cover_tree.hpp.

◆ Parent() [2/2]

CoverTree*& Parent ( )
inline

Modify the parent node.

Definition at line 390 of file cover_tree.hpp.

◆ ParentDistance() [1/2]

ElemType ParentDistance ( ) const
inline

Get the distance to the parent.

Definition at line 393 of file cover_tree.hpp.

◆ ParentDistance() [2/2]

ElemType& ParentDistance ( )
inline

Modify the distance to the parent.

Definition at line 395 of file cover_tree.hpp.

◆ Point() [1/2]

size_t Point ( ) const
inline

Get the index of the point which this node represents.

Definition at line 270 of file cover_tree.hpp.

◆ Point() [2/2]

size_t Point ( const size_t  ) const
inline

For compatibility with other trees; the argument is ignored.

Definition at line 272 of file cover_tree.hpp.

◆ RangeDistance() [1/4]

math::RangeType<ElemType> RangeDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other) const

Return the minimum and maximum distance to another node.

Referenced by CoverTree< MetricType, StatisticType, MatType, RootPointPolicy >::Stat().

◆ RangeDistance() [2/4]

math::RangeType<ElemType> RangeDistance ( const CoverTree< MetricType, StatisticType, MatType, RootPointPolicy > &  other,
const ElemType  distance 
) const

Return the minimum and maximum distance to another node given that the point-to-point distance has already been calculated.

◆ RangeDistance() [3/4]

math::RangeType<ElemType> RangeDistance ( const arma::vec &  other) const

Return the minimum and maximum distance to another point.

◆ RangeDistance() [4/4]

math::RangeType<ElemType> RangeDistance ( const arma::vec &  other,
const ElemType  distance 
) const

Return the minimum and maximum distance to another point given that the point-to-point distance has already been calculated.
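For the node-to-node case, both furthest descendant distances enter the bound. The sketch below again assumes the triangle inequality holds; `DistanceRange` and `NodeRangeBound` are hypothetical stand-ins and do not reproduce the interface of math::RangeType.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical closed interval [lo, hi] of possible distances.
struct DistanceRange
{
  double lo;
  double hi;
};

// Range of distances between any descendant of node A and any descendant of
// node B, given the precomputed distance between the two node centers.
DistanceRange NodeRangeBound(const double centerDistance,
                             const double furthestDescendantA,
                             const double furthestDescendantB)
{
  assert(centerDistance >= 0.0);
  assert(furthestDescendantA >= 0.0 && furthestDescendantB >= 0.0);

  const double spread = furthestDescendantA + furthestDescendantB;
  return DistanceRange{ std::max(centerDistance - spread, 0.0),
                        centerDistance + spread };
}
```

Computing both ends together shares the `spread` term, which is one reason a combined range query can be preferable to separate minimum and maximum calls.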

◆ Scale() [1/2]

int Scale ( ) const
inline

Get the scale of this node.

Definition at line 299 of file cover_tree.hpp.

◆ Scale() [2/2]

int& Scale ( )
inline

Modify the scale of this node. Be careful...

Definition at line 301 of file cover_tree.hpp.

◆ Serialize()

void Serialize ( Archive &  ar,
const unsigned int 
)

Serialize the tree.

◆ Stat() [1/2]

const StatisticType& Stat ( ) const
inline

Get the statistic for this node.

Definition at line 309 of file cover_tree.hpp.

◆ Stat() [2/2]

StatisticType& Stat ( )

Modify the statistic for this node.


The documentation for this class was generated from the following file: