[mlpack] CMakeLists adjustments for compiling mlpack with Armadillo using OpenBLAS

Steenwijk, Martijn m.steenwijk at vumc.nl
Sat Dec 28 15:30:34 EST 2013


I'll look into allkrann more closely, thanks.

Regarding ANN, I looked back into my code and ANN's manual, and it seems that it supports both bd-trees and kd-trees (I'm using kd-trees). However, could the splitting rule differ from that of other implementations?
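
For reference, here is a minimal sketch of an exact kd-tree query with ANN's
C++ API (assuming ANN 1.1.x; the splitting rule is selected by the last
constructor argument, and eps = 0 gives the "exact" search I mentioned):

    #include <ANN/ANN.h>

    int main()
    {
      const int nPts = 10000, dim = 3, k = 5;

      // Allocate the data points (coordinates would be filled in here).
      ANNpointArray dataPts = annAllocPts(nPts, dim);

      // Build a kd-tree; ANN_KD_SUGGEST is the manual's recommended
      // splitting rule (others: ANN_KD_STD, ANN_KD_MIDPT, ANN_KD_SL_MIDPT,
      // ...), which is exactly where implementations can differ.
      ANNkd_tree tree(dataPts, nPts, dim, 1 /* bucket size */, ANN_KD_SUGGEST);

      ANNpoint queryPt = annAllocPt(dim);  // fill with query coordinates
      ANNidxArray nnIdx = new ANNidx[k];
      ANNdistArray dists = new ANNdist[k];

      tree.annkSearch(queryPt, k, nnIdx, dists, 0.0);  // eps = 0: exact

      delete[] nnIdx;
      delete[] dists;
      annDeallocPt(queryPt);
      annDeallocPts(dataPts);
      annClose();
      return 0;
    }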

> On 28 Dec 2013, at 21:22, "Ryan Curtin" <gth671b at mail.gatech.edu> wrote:
> 
>> On Sat, Dec 28, 2013 at 08:06:32PM +0000, Steenwijk, Martijn wrote:
>> Thanks again for your response :)
>> 
>> @benchmarks: thanks, that looks pretty impressive. Marcus did a pretty
>> damn cool job on the benchmarking system. :-) It would be really helpful
>> to have widely used libraries such as ANN and FLANN in there, but I'm
>> not sure whether he still has time after this... The problem (or,
>> stated differently, the "challenge") with my data is always the number
>> of points. There is no comparable standard dataset of this size...
>> 
>> Oh, before I forget: I use ANN with "exact" precision.
> 
> Ah, ok.  Well that's less exciting, but still good to hear mlpack is at
> least keeping up.
> 
> Comparing against ANN or FLANN is somewhat difficult because what we're
> trying to compare is specific implementations and not necessarily
> different algorithms (Marcus, feel free to correct me if you have
> different ideas).  So because we don't implement the bd-tree, a
> comparison with ANN isn't just an implementation comparison.  At the
> same time, no other library (to my knowledge) implements dual-tree
> nearest-neighbor search, so that comparison is already more than just an
> implementation comparison.  If I have some time I'll see if I can add
> tests for ANN and FLANN to the existing benchmarks.
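> 
> For concreteness, a minimal sketch of a dual-tree search with mlpack's
> C++ API (assuming the mlpack 1.0.x interface; the filename is a
> placeholder):
> 
>     #include <mlpack/core.hpp>
>     #include <mlpack/methods/neighbor_search/neighbor_search.hpp>
> 
>     using namespace mlpack;
>     using namespace mlpack::neighbor;
> 
>     int main()
>     {
>       // mlpack convention: each *column* of the matrix is one point.
>       arma::mat dataset;
>       data::Load("dataset.csv", dataset, true);  // placeholder filename
> 
>       // AllkNN is NeighborSearch<NearestNeighborSort>; it uses the
>       // dual-tree algorithm by default (single-tree and brute-force
>       // modes are constructor options).
>       AllkNN allknn(dataset);
> 
>       arma::Mat<size_t> neighbors;
>       arma::mat distances;
>       allknn.Search(5, neighbors, distances);  // 5-NN of every point
> 
>       return 0;
>     }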
> 
>> @allkrann: that's another possibility, although my application
>> normally requires exact accuracy (or very low error). Anyway, thanks
>> again, I'll try some things and let you know how they worked out.
> 
> The idea behind rank approximation is interesting; instead of returning
> a neighbor whose distance is within, say, 5% of the true nearest
> neighbor's distance, rank approximation guarantees (probabilistically)
> that the returned neighbors are in the top N% of results.  So, for
> instance, if you have a dataset with 10000 points and set k = 5 (return
> 5 neighbors), a desired success probability of 0.95, and a rank error of
> 0.1%, then with probability 0.95, each of your 5 returned neighbors will
> be one of the true top 10 neighbors (0.1% of 10000 points is 10
> points).
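> 
> A two-line check of that arithmetic (a sketch, using the numbers above):
> 
>     #include <cmath>
>     #include <cstdio>
> 
>     int main()
>     {
>       const double n = 10000;  // dataset size
>       const double tau = 0.1;  // rank error, in percent
>       // Maximum allowed true rank of any returned neighbor:
>       const int maxRank = (int) std::ceil(n * tau / 100.0);
>       std::printf("rank bound = %d\n", maxRank);  // prints 10
>       return 0;
>     }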
> 
> Here's a link to the paper:
> 
>  http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2009_0435.pdf
> 
> It's not a particularly common idea, so mlpack has the only
> implementation of it.
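> 
> A sketch of driving it from C++ (this assumes the RASearch interface in
> the 1.0.x rann module; the tau/alpha parameter names, order, and
> defaults should be checked against ra_search.hpp):
> 
>     #include <mlpack/core.hpp>
>     #include <mlpack/methods/rann/ra_search.hpp>
> 
>     using namespace mlpack;
>     using namespace mlpack::neighbor;
> 
>     int main()
>     {
>       arma::mat dataset;  // columns are points
>       data::Load("dataset.csv", dataset, true);  // placeholder filename
> 
>       RASearch<> rann(dataset);  // rank-approximate neighbor search
> 
>       arma::Mat<size_t> neighbors;
>       arma::mat distances;
>       // k = 5, tau = 0.1 (rank error, percent), alpha = 0.95 (success
>       // probability); parameter order assumed from the 1.0.x headers.
>       rann.Search(5, neighbors, distances, 0.1, 0.95);
> 
>       return 0;
>     }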
> 
> -- 
> Ryan Curtin    | "Hungry."
> ryan at ratml.org |   - Sphinx


