[mlpack] GSOC 2013 Proposal

Ryan Curtin gth671b at mail.gatech.edu
Tue Apr 9 23:47:07 EDT 2013


On Wed, Apr 10, 2013 at 08:49:12AM +0530, Siddharth wrote:
> Hi,
> 
> I am currently pursuing MS by Research in the field of image processing and
> machine learning at IIIT- Hyderabad, India. I am interested in working for
> mlpack this summer in a GSoC project.
> 
> I propose to implement techniques for dimensionality reduction and metric
> learning.
> 
> http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html
> 
> I think this will be a good addition to mlpack.

Which techniques are you planning to implement?  Many of those nonlinear
dimensionality reduction techniques are slow.  What would be more
interesting instead is a working implementation of MVU (maximum variance
unfolding) with LRSDP (low-rank semidefinite programming); see the end
of the ideas list.

There is no point in mlpack implementing a variety of methods poorly.
Breadth is important but it cannot come at the cost of quality.

If you are going to propose implementing more than one of these
methods, you will have to convince us that you can implement all of
them effectively, and with tests.  Proper unit tests are very important,
and testing machine learning methods for accuracy -- especially when
most of the "tests" available are just reference implementations which
could contain bugs -- is a rather difficult task which can take far
longer than implementing the methods themselves.

Also, if you were planning to just wrap the MATLAB functions or
"translate" them to C++, that's not acceptable.  The implementation must
be from the ground up and needs to be provably faster than any existing
implementations to be accepted into the mlpack codebase (of course, you
can't know that when you propose the project, but if you have papers
which detail faster algorithms to implement, those are a good start).

> There are also many kernel functions which have not been implemented.
> 
> http://crsouza.blogspot.in/2010/03/kernel-functions-for-machine-learning.html

I could implement these, complete with tests, in probably a day and a
half.  This could be an addendum to an existing project but is too
simple for someone with knowledge as advanced as yours.
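
To give a rough sense of the scale involved, here is a minimal sketch of
one such kernel (the Gaussian kernel) together with the kind of check a
unit test might make.  This is only an illustration: the names
GaussianKernel and GaussianKernelLooksCorrect are hypothetical, the
std::vector interface is an assumption made to keep the example
standalone, and none of it is mlpack's actual API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical standalone Gaussian kernel (not mlpack's interface):
//   K(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))
double GaussianKernel(const std::vector<double>& a,
                      const std::vector<double>& b,
                      const double bandwidth)
{
  assert(a.size() == b.size());

  // Accumulate the squared Euclidean distance between a and b.
  double squaredDistance = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double diff = a[i] - b[i];
    squaredDistance += diff * diff;
  }

  return std::exp(-squaredDistance / (2.0 * bandwidth * bandwidth));
}

// Checks properties that hold regardless of any reference implementation:
// K(x, x) = 1, symmetry, and one value worked out by hand.
bool GaussianKernelLooksCorrect()
{
  const std::vector<double> x = {1.0, 2.0, 3.0};
  const std::vector<double> y = {1.0, 2.0, 5.0};
  const double tolerance = 1e-12;

  // A point is at distance zero from itself, so the kernel value is 1.
  if (std::abs(GaussianKernel(x, x, 1.0) - 1.0) > tolerance)
    return false;

  // The kernel must be symmetric in its arguments.
  if (std::abs(GaussianKernel(x, y, 1.0) - GaussianKernel(y, x, 1.0)) >
      tolerance)
    return false;

  // ||x - y||^2 = 4, so with bandwidth 1 the value is exp(-4 / 2).
  if (std::abs(GaussianKernel(x, y, 1.0) - std::exp(-2.0)) > tolerance)
    return false;

  return true;
}
```

The checks rely on properties that can be verified by hand rather than
on another implementation's output, which is why a kernel like this is
easy to test compared to a full dimensionality reduction method.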

Hopefully these are helpful answers.  Let me know if I can clarify
further. 

-- 
Ryan Curtin       | "Gentlemen, you can't fight in here!  This is the
ryan at igglybob.com | War Room!" - President Muffley


