[mlpack] Parameters for DBSCAN

Ryan Curtin ryan at ratml.org
Sun Dec 30 23:51:34 EST 2018


On Sun, Dec 30, 2018 at 11:42:54PM -0500, Tharindu Mathew wrote:
> Hi,
> 
> Would someone kindly point me to what range and point selection types are
> available for dbscan? Going through header or API docs didn't help in this
> case.

Hey there Tharindu,

I dug into it a little bit after seeing your IRC messages.  You can see
my responses here:

http://mlpack.org/irc/mlpack.20181230.html

Actually I ended up noticing that PointSelectionPolicy isn't even used,
so I opened an issue about it:

https://github.com/mlpack/mlpack/issues/1625

As for RangeSearchType, it is there in case someone wants to use a
different algorithm for the range search step of DBSCAN.  Right now
mlpack::range::RangeSearch is the only option implemented in the
library, and the results it gives are exact.  So the choice only makes a
difference if you want to, e.g., use RangeSearch with a different metric
(RangeSearch itself has some template parameters), or replace it with a
different algorithm that might run faster in your case.  In the latter
case you'd need to hand-implement it, and it would just need to have a
Search() function matching the API of RangeSearch.
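
To make the "just needs a Search() function" idea concrete, here is a
minimal self-contained sketch.  It does not use mlpack at all: Point,
EuclideanMetric, ManhattanMetric, BruteForceRangeSearch, ClusterSketch,
and kNoise are hypothetical names made up for illustration, and the
Search() signature is a simplification, not the real RangeSearch API.
It also assumes that a point counts as its own neighbor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2-D point type, for illustration only.
struct Point { double x, y; };
using Neighbors = std::vector<std::vector<std::size_t>>;

// Two interchangeable metrics (assumed interface: a static Evaluate()).
struct EuclideanMetric
{
  static double Evaluate(const Point& a, const Point& b)
  { return std::hypot(a.x - b.x, a.y - b.y); }
};

struct ManhattanMetric
{
  static double Evaluate(const Point& a, const Point& b)
  { return std::abs(a.x - b.x) + std::abs(a.y - b.y); }
};

// Stand-in range searcher (NOT mlpack's RangeSearch): exact, brute
// force, O(n^2).  The clustering code only relies on Search().
template<typename MetricType = EuclideanMetric>
struct BruteForceRangeSearch
{
  // neighbors[i] receives the indices of all points (i included)
  // within `radius` of point i.
  void Search(const std::vector<Point>& data,
              const double radius,
              Neighbors& neighbors) const
  {
    neighbors.assign(data.size(), std::vector<std::size_t>());
    for (std::size_t i = 0; i < data.size(); ++i)
      for (std::size_t j = 0; j < data.size(); ++j)
        if (MetricType::Evaluate(data[i], data[j]) <= radius)
          neighbors[i].push_back(j);
  }
};

// Label for points that end up in no cluster.
constexpr std::size_t kNoise = static_cast<std::size_t>(-1);

// Textbook DBSCAN with the range search step supplied as a template
// parameter.  Returns the number of clusters found.
template<typename RangeSearchType = BruteForceRangeSearch<>>
std::size_t ClusterSketch(const std::vector<Point>& data,
                          const double epsilon,
                          const std::size_t minPoints,
                          std::vector<std::size_t>& assignments,
                          RangeSearchType rangeSearch = RangeSearchType())
{
  assert(epsilon >= 0.0);
  Neighbors neighbors;
  rangeSearch.Search(data, epsilon, neighbors);

  assignments.assign(data.size(), kNoise);
  std::vector<bool> visited(data.size(), false);
  std::size_t clusters = 0;
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    // Only an unvisited core point can start a new cluster.
    if (visited[i] || neighbors[i].size() < minPoints)
      continue;

    std::vector<std::size_t> frontier(1, i);
    visited[i] = true;
    while (!frontier.empty())
    {
      const std::size_t p = frontier.back();
      frontier.pop_back();
      assignments[p] = clusters;
      // Border points join the cluster but do not expand it.
      if (neighbors[p].size() < minPoints)
        continue;
      for (const std::size_t q : neighbors[p])
      {
        if (!visited[q])
        {
          visited[q] = true;
          frontier.push_back(q);
        }
      }
    }
    ++clusters;
  }
  return clusters;
}
```

In this sketch, calling ClusterSketch(data, 1.5, 2, assignments) uses
the Euclidean default, while
ClusterSketch<BruteForceRangeSearch<ManhattanMetric>>(data, 1.5, 2,
assignments) swaps the metric without touching the clustering code.
Since both searchers are exact, only the metric can change the result.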

The biggest parameters that will make a difference for the DBSCAN
clustering are epsilon and minPoints.  You might also consider trying
some of the other clustering algorithms in mlpack, like MeanShift and
KMeans.
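
As a rough illustration of why those two parameters matter so much,
here is a small self-contained sketch (again not mlpack code) that
counts "core" points under the textbook DBSCAN definition: a point is
core if at least minPoints points lie within epsilon of it.
CountCorePoints is a hypothetical helper, the data is 1-D for brevity,
and counting the point itself as a neighbor is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Count points that have at least `minPoints` points (themselves
// included) within distance `epsilon`.  Brute force, O(n^2).
std::size_t CountCorePoints(const std::vector<double>& data,
                            const double epsilon,
                            const std::size_t minPoints)
{
  assert(epsilon >= 0.0);
  std::size_t core = 0;
  for (const double p : data)
  {
    std::size_t inRange = 0;
    for (const double q : data)
      if (std::abs(p - q) <= epsilon)
        ++inRange;
    if (inRange >= minPoints)
      ++core;
  }
  return core;
}
```

With the points 0, 1, 2, and 10, a small epsilon leaves no core points
(everything is noise), a larger epsilon or a smaller minPoints makes
more points core, and a very large epsilon makes every point core so
that everything tends to merge into one cluster.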

Hope this helps!

-- 
Ryan Curtin    | "This room is green."
ryan at ratml.org |   - Kazan

