[mlpack] Automatic Benchmarking

Marcus Edel marcus.edel at fu-berlin.de
Fri Feb 28 16:38:28 EST 2014


Hello,

Thank you for your interest in the project. I'm sorry for the slow response.

On 28 Feb 2014, at 21:55, Ryan Curtin <gth671b at mail.gatech.edu> wrote:

> 
> Hi Praveen,
> 
> Thank you for the link to the WiseRF post.  Benchmarking in and of
> itself is a very difficult task, especially with respect to getting
> "unbiased" results.  In reality, no result is unbiased, and it's never
> possible to say "this algorithm is better!" because almost certainly
> there is some dataset for which it isn't.
> 
> What the benchmarking system currently provides is a way to get a quick
> idea of which implementation of an algorithm runs most quickly, but this
> is aimed at answering one question:
> 
>  "How fast is an mlpack implementation of an algorithm compared to
>   other implementations?"
> 
> But there are lots of possible questions we can aim to answer.  Here are
> a few (there are lots of other possibilities):
> 
>  "How quickly does this approximate algorithm converge to a result
>   compared to other implementations of the same approximate algorithm?"
> 
>  "How quickly does this approximate algorithm converge to a result
>   compared to other implementations of different approximate
>   algorithms?"
> 
>  "How accurate is this classifier compared to other classifiers,
>   regardless of runtime?"
> 
>  "If I specify how much time I have, which algorithm gives the most
>   accurate result in that time frame?"

Here are some more questions:

"Which implementation requires the least amount of memory?"

"Which implementation requires the longest time to train/test a model (Distinguish between training and evaluation time)?"

"Which algorithm achieves the best results over all datasets in a particular field?”

> 
> None of these questions are particularly difficult to answer for a given
> set of algorithms, but often the hard part is finding a good way to
> visualize this data (especially that last question is very hard).

Right, the visualization is the tricky part. Keep in mind that we wouldn't mind at all if someone proposed a complete redesign :).

> 
> I think a good aim for this project is to expand the scope of the
> benchmarking system to be able to answer one of those other questions.
> Choosing parameters is another degree of freedom in the question to be
> answered -- it could be "compare these algorithms with default
> parameters" or "cross-validate the parameters and compare the _best_
> results of the algorithms".
> 
> I think I have proposed lots of vague ideas instead of giving a definite
> answer, so I am sorry about that.  :)
> 
> The project is quite open-ended and as a result it is up to the student
> to come up with some interesting ideas they would like to see
> implemented.  Marcus can correct me if he wanted to see the project go a
> particular way, because after all the benchmarking system as a whole is
> his work.  :)

Additionally, a potential student could also improve the base system itself, for example by distributing the benchmark jobs across different machines to reduce the total benchmarking time, restricting the CPU resources available to a run, etc. In the end it's an open-ended project, so any improvement that's made to the system will help in some way :)
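
For the resource restrictions I'm thinking of something along these
lines (a rough sketch, Linux only; the benchmark command is just a
placeholder and none of this exists in the current scripts):

    # Rough sketch: run a benchmark command pinned to two cores, with
    # caps on CPU time and address space (Linux only).
    import resource
    import subprocess

    def run_limited(cmd, cpu_seconds=300, memory_bytes=2 * 1024**3):
        def set_limits():
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))
        # taskset -c 0,1 pins the child process to cores 0 and 1.
        return subprocess.call(["taskset", "-c", "0,1"] + cmd,
                               preexec_fn=set_limits)

    # Placeholder benchmark command.
    run_limited(["./run_benchmark.sh", "ALLKNN", "dataset.csv"])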

> I also think it's important to have an open-ended discussion about this
> project in a public place, so if anyone else out there has ideas or
> opinions about how this should work, please chime in!
> 

If you have any ideas, I'd like to hear about them.
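
One more thought on the scoring metrics Praveen mentions below
(classification accuracy, mean squared error, k-means inertia): they
should map fairly directly onto what most libraries already expose, so
recording them per method shouldn't be hard. A rough sketch with
scikit-learn, just to illustrate:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import accuracy_score, mean_squared_error

    # Classification: fraction of correctly classified samples.
    print(accuracy_score([0, 1, 1, 0], [0, 1, 0, 0]))            # 0.75

    # Regression: mean squared error.
    print(mean_squared_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))

    # k-means: inertia, the sum of squared distances to the closest centroid.
    X = np.array([[0, 0], [0, 1], [10, 10], [10, 11]])
    print(KMeans(n_clusters=2, n_init=10).fit(X).inertia_)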

Thanks,
Marcus


> On Fri, Feb 28, 2014 at 10:09:12AM +0005, Praveen wrote:
>> Hello, 
>> I am Praveen Venkateswaran, an undergraduate doing Computer Science
>> and Mathematics in India. 
>> I have worked with various machine learning algorithms as well as on
>> information retrieval and I would love to be able to contribute to
>> mlpack starting with GSoC 2014. 
>> 
>> I am interested in working on improving the automatic benchmarking
>> system that was built last summer. I would like to start by
>> comparing the accuracy of the implementations in the various
>> libraries. I've been browsing resources to try and find a starting
>> point for this. [0] describes WiseRF's benchmarking of random
>> forest classification.
>> 
>> The point that strikes me the most is that, wherever possible, they
>> tried to see which parameters yielded the best results for each
>> individual library, and then compared the libraries using those
>> per-library parameters instead of just the defaults. I totally
>> agree with this, since it yields less biased results -- what do you
>> think about this approach?
>> I had already spoken to Ryan to clarify details of the project, and
>> the crux would be the comparison of parameters. If we follow the
>> point above, we would have to individually work out the best
>> parameters for each library for the given dataset size range and
>> then run the individual methods with those.
>> Then we could score them on that basis (for classification
>> algorithms, the fraction of correctly classified samples; for
>> regression algorithms, the mean squared error; and for k-means, the
>> inertia criterion), or something along those lines (I'm not too
>> sure about this, as I don't have experience with all the libraries
>> being tested).
>> Please let me know what you think about this; any further
>> suggestions would be most welcome.
>> 
>> [0] http://about.wise.io/blog/2013/07/15/benchmarking-random-forest-part-1/

> 