mlpack
fast, flexible C++ machine learning library
DBSCAN< RangeSearchType, PointSelectionPolicy > Class Template Reference

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering technique introduced by Ester et al. (1996); the full citation is given in the Detailed Description.

Public Member Functions

 DBSCAN (const double epsilon, const size_t minPoints, const bool batchMode=true, RangeSearchType rangeSearch=RangeSearchType(), PointSelectionPolicy pointSelector=PointSelectionPolicy())
 Construct the DBSCAN object with the given parameters.

 
template<typename MatType>
size_t Cluster (const MatType &data, arma::mat &centroids)
 Performs DBSCAN clustering on the data, returning the number of clusters and computing the centroid of each cluster.

 
template<typename MatType>
size_t Cluster (const MatType &data, arma::Row< size_t > &assignments)
 Performs DBSCAN clustering on the data, returning the number of clusters and computing the list of cluster assignments.

 
template<typename MatType>
size_t Cluster (const MatType &data, arma::Row< size_t > &assignments, arma::mat &centroids)
 Performs DBSCAN clustering on the data, returning the number of clusters and computing both the centroid of each cluster and the list of cluster assignments.

 

Detailed Description


template<typename RangeSearchType = range::RangeSearch<>, typename PointSelectionPolicy = OrderedPointSelection>
class mlpack::dbscan::DBSCAN< RangeSearchType, PointSelectionPolicy >

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering technique described in the following paper:

@inproceedings{ester1996density,
title={A density-based algorithm for discovering clusters in large spatial
databases with noise.},
author={Ester, M. and Kriegel, H.-P. and Sander, J. and Xu, X.},
booktitle={Proceedings of the Second International Conference on Knowledge
Discovery and Data Mining (KDD '96)},
pages={226--231},
year={1996}
}

The DBSCAN algorithm iteratively clusters points using range searches with a specified radius parameter. This implementation allows configuration of the range search technique used and the point selection strategy by means of template parameters.
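The clustering loop described above can be illustrated with a standalone sketch. This is not mlpack's implementation: it is a textbook DBSCAN using only the standard library, with a brute-force range search in place of a configurable RangeSearchType and points visited in index order. The names Point, RangeQuery and DbscanSketch are hypothetical, and the sketch assumes that a point counts itself toward minPoints.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-D point type for this sketch (mlpack uses Armadillo matrices).
struct Point { double x, y; };

// Brute-force range search: indices of all points within epsilon of points[i].
// The query point itself is included in the result.
std::vector<std::size_t> RangeQuery(const std::vector<Point>& points,
                                    std::size_t i, double epsilon)
{
  std::vector<std::size_t> result;
  for (std::size_t j = 0; j < points.size(); ++j)
  {
    const double dx = points[i].x - points[j].x;
    const double dy = points[i].y - points[j].y;
    if (std::sqrt(dx * dx + dy * dy) <= epsilon)
      result.push_back(j);
  }
  return result;
}

// Returns the number of clusters.  assignments[i] is the cluster index of
// point i, or SIZE_MAX if point i is noise.
std::size_t DbscanSketch(const std::vector<Point>& points, double epsilon,
                         std::size_t minPoints,
                         std::vector<std::size_t>& assignments)
{
  assert(epsilon >= 0.0);
  assignments.assign(points.size(), SIZE_MAX);
  std::vector<bool> visited(points.size(), false);
  std::size_t numClusters = 0;

  for (std::size_t i = 0; i < points.size(); ++i)
  {
    if (visited[i]) continue;
    visited[i] = true;

    std::vector<std::size_t> frontier = RangeQuery(points, i, epsilon);
    if (frontier.size() < minPoints) continue;  // Not a core point.

    // Start a new cluster and grow it through neighboring core points.
    const std::size_t cluster = numClusters++;
    assignments[i] = cluster;
    for (std::size_t k = 0; k < frontier.size(); ++k)
    {
      const std::size_t j = frontier[k];
      if (assignments[j] == SIZE_MAX) assignments[j] = cluster;
      if (visited[j]) continue;
      visited[j] = true;

      const std::vector<std::size_t> nb = RangeQuery(points, j, epsilon);
      if (nb.size() >= minPoints)  // j is a core point: expand through it.
        frontier.insert(frontier.end(), nb.begin(), nb.end());
    }
  }
  return numClusters;
}
```

Every point triggers at most one range search, so the cost of the sketch is dominated by the range search itself, which is the part this class makes configurable.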

Template Parameters
RangeSearchType: Class to use for range searching.
PointSelectionPolicy: Strategy for selecting the next point to cluster with.
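Configuring behavior through a template parameter (policy-based design) can be sketched as follows. The policy interface shown here, a Select(step, numPoints) member, is a hypothetical one chosen for illustration and should not be read as the interface mlpack requires of a PointSelectionPolicy; InOrderSelection, ReverseSelection and VisitOrder are likewise invented names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical policy: visit points in index order.
struct InOrderSelection
{
  std::size_t Select(std::size_t step, std::size_t /* numPoints */)
  {
    return step;
  }
};

// Hypothetical policy: visit points in reverse index order.
struct ReverseSelection
{
  std::size_t Select(std::size_t step, std::size_t numPoints)
  {
    assert(step < numPoints);
    return numPoints - 1 - step;
  }
};

// The policy is fixed at compile time; an instance can also be passed in,
// mirroring the optional constructor argument pattern.
template<typename PointSelectionPolicy = InOrderSelection>
class VisitOrder
{
 public:
  explicit VisitOrder(PointSelectionPolicy selector = PointSelectionPolicy()) :
      selector(selector) { }

  // Returns the order in which the points would be visited.
  std::vector<std::size_t> Order(std::size_t numPoints)
  {
    std::vector<std::size_t> order;
    for (std::size_t step = 0; step < numPoints; ++step)
      order.push_back(selector.Select(step, numPoints));
    return order;
  }

 private:
  PointSelectionPolicy selector;
};
```

Swapping the template argument, for example VisitOrder<ReverseSelection>, changes the strategy without touching the class that uses it.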

Definition at line 53 of file dbscan.hpp.

Constructor & Destructor Documentation

◆ DBSCAN()

DBSCAN ( const double  epsilon,
const size_t  minPoints,
const bool  batchMode = true,
RangeSearchType  rangeSearch = RangeSearchType(),
PointSelectionPolicy  pointSelector = PointSelectionPolicy() 
)

Construct the DBSCAN object with the given parameters.

Set batchMode to false when memory is a concern, for example when the dataset is very large or epsilon is large. When batchMode is false, each point is searched individually, which may be slower but uses less memory.

Parameters
epsilon: Size (radius) of the range query.
minPoints: Minimum number of points for each cluster.
batchMode: If true, all points are searched in batch.
rangeSearch: Optional instantiated RangeSearch object.
pointSelector: Optional instantiated PointSelectionPolicy object.
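The memory trade-off behind batchMode can be illustrated with a small standalone sketch. It is not mlpack's code: it assumes 1-D data, a brute-force query, and hypothetical helper names (Query, PeakStoredBatch, PeakStoredSingle). It only compares how many neighbor indices are held at once when every point's results are kept together versus when one point is searched at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Neighbors = std::vector<std::size_t>;

// Brute-force 1-D range query: indices within epsilon of data[i].
Neighbors Query(const std::vector<double>& data, std::size_t i, double epsilon)
{
  assert(i < data.size());
  Neighbors result;
  for (std::size_t j = 0; j < data.size(); ++j)
    if (std::abs(data[i] - data[j]) <= epsilon)
      result.push_back(j);
  return result;
}

// Batch style: the neighbor lists of all points are held simultaneously.
std::size_t PeakStoredBatch(const std::vector<double>& data, double epsilon)
{
  std::vector<Neighbors> all;
  all.reserve(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    all.push_back(Query(data, i, epsilon));

  std::size_t stored = 0;
  for (const Neighbors& n : all)
    stored += n.size();
  return stored;
}

// One-at-a-time style: only a single neighbor list is alive at any moment.
std::size_t PeakStoredSingle(const std::vector<double>& data, double epsilon)
{
  std::size_t peak = 0;
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    const Neighbors n = Query(data, i, epsilon);
    peak = std::max(peak, n.size());
  }
  return peak;
}
```

With a large epsilon, nearly every point is a neighbor of every other, so the batch figure grows roughly quadratically in the number of points while the one-at-a-time figure grows linearly.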

Member Function Documentation

◆ Cluster() [1/3]

size_t Cluster ( const MatType &  data,
arma::mat &  centroids 
)

Performs DBSCAN clustering on the data, returning the number of clusters and computing the centroid of each cluster.

Template Parameters
MatType: Type of matrix (arma::mat or arma::sp_mat).
Parameters
data: Dataset to cluster.
centroids: Matrix in which centroids are stored.

◆ Cluster() [2/3]

size_t Cluster ( const MatType &  data,
arma::Row< size_t > &  assignments 
)

Performs DBSCAN clustering on the data, returning the number of clusters and computing the list of cluster assignments.

If assignments[i] == SIZE_MAX, then point i is considered "noise".

Template Parameters
MatType: Type of matrix (arma::mat or arma::sp_mat).
Parameters
data: Dataset to cluster.
assignments: Vector to store cluster assignments.
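Because noise points carry the value SIZE_MAX, an assignment must be checked before it is used as an index. The sketch below shows one way to do this, using std::vector<std::size_t> as a stand-in for arma::Row<size_t> so that it compiles without Armadillo; CountNoise and ClusterSizes are hypothetical helpers, not part of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of points labeled as noise (assignment equal to SIZE_MAX).
std::size_t CountNoise(const std::vector<std::size_t>& assignments)
{
  std::size_t noise = 0;
  for (const std::size_t a : assignments)
    if (a == SIZE_MAX)
      ++noise;
  return noise;
}

// Number of points in each cluster; noise points are skipped so that
// SIZE_MAX is never used as an index.
std::vector<std::size_t> ClusterSizes(
    const std::vector<std::size_t>& assignments, std::size_t numClusters)
{
  std::vector<std::size_t> sizes(numClusters, 0);
  for (const std::size_t a : assignments)
  {
    if (a == SIZE_MAX) continue;
    assert(a < numClusters);
    ++sizes[a];
  }
  return sizes;
}
```

Here numClusters would be the value returned by Cluster().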

◆ Cluster() [3/3]

size_t Cluster ( const MatType &  data,
arma::Row< size_t > &  assignments,
arma::mat &  centroids 
)

Performs DBSCAN clustering on the data, returning the number of clusters and computing both the centroid of each cluster and the list of cluster assignments.

If assignments[i] == SIZE_MAX, then point i is considered "noise".

Template Parameters
MatType: Type of matrix (arma::mat or arma::sp_mat).
Parameters
data: Dataset to cluster.
assignments: Vector to store cluster assignments.
centroids: Matrix in which centroids are stored.
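The relationship between assignments and centroids can be sketched as follows, assuming a centroid is the arithmetic mean of the points assigned to a cluster and that noise points contribute to no centroid. Both are assumptions of this sketch, not statements taken from the documentation above. Points are stored as std::vector<double> instead of Armadillo columns so the example is self-contained; Centroids is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Point = std::vector<double>;

// Mean of the points in each cluster; points with assignment SIZE_MAX
// (noise) are ignored.  An empty cluster keeps an all-zero centroid.
std::vector<Point> Centroids(const std::vector<Point>& data,
                             const std::vector<std::size_t>& assignments,
                             std::size_t numClusters)
{
  assert(data.size() == assignments.size());
  const std::size_t dim = data.empty() ? 0 : data[0].size();
  std::vector<Point> centroids(numClusters, Point(dim, 0.0));
  std::vector<std::size_t> counts(numClusters, 0);

  // Accumulate per-cluster sums.
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    const std::size_t c = assignments[i];
    if (c == SIZE_MAX) continue;
    assert(c < numClusters);
    for (std::size_t d = 0; d < dim; ++d)
      centroids[c][d] += data[i][d];
    ++counts[c];
  }

  // Turn sums into means.
  for (std::size_t c = 0; c < numClusters; ++c)
    if (counts[c] > 0)
      for (double& v : centroids[c])
        v /= static_cast<double>(counts[c]);

  return centroids;
}
```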

The documentation for this class was generated from the following file:
  • /home/ryan/src/mlpack.org/_src/mlpack-git/src/mlpack/methods/dbscan/dbscan.hpp