[mlpack] Algorithm Optimization-GSoC'18 Idea

Ryan Curtin ryan at ratml.org
Mon Mar 5 09:58:11 EST 2018


On Sun, Mar 04, 2018 at 02:04:00PM +0530, Prashanth Duvvada wrote:
> Hello,
> 
> My name is Prashanth Duvvada and I'm currently a 2nd year Computer Science
> student of Amrita University, India.
> 
> I'm interested in working on the GSoC 2k18 project "Algorithm Optimization"
> and would like to work on improving the K Means algorithm.

Hi Prashanth,

Thanks for getting in touch.  k-means++ would be a useful initialization
method to add, but do be aware that it is orthogonal to the use of
Lloyd's algorithm.  In addition, mlpack already implements about five
accelerated schemes besides the standard Lloyd's algorithm.

> 1) With K-Means++, it's sure to find the solution in O(log k).

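This isn't true.  The O(log k) result for k-means++ bounds the expected
quality of the seeding (the objective is within an O(log k) factor of
optimal in expectation); it is not a bound on runtime.  k-means++ makes
no guarantees on the runtime after seeding.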
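
To make the distinction concrete, here is a minimal sketch of k-means++
seeding in plain C++.  It uses only the standard library and is not
mlpack's API; the names Point, SquaredDistance, and KMeansPlusPlusSeed
are hypothetical.  It only picks the k starting centroids.  Whichever
iteration scheme runs afterwards (Lloyd's or an accelerated variant) is
a separate step, and the seeding says nothing about how many iterations
that step needs.  The sketch assumes the data contains at least k
distinct points.

```cpp
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const Point& a, const Point& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
  {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// Hypothetical helper (not part of mlpack): returns the indices of k
// points chosen by k-means++ seeding.
// Assumes 1 <= k <= data.size() and at least k distinct points.
std::vector<std::size_t> KMeansPlusPlusSeed(const std::vector<Point>& data,
                                            const std::size_t k,
                                            std::mt19937& rng)
{
  std::vector<std::size_t> chosen;

  // The first centroid is drawn uniformly at random.
  std::uniform_int_distribution<std::size_t> uniform(0, data.size() - 1);
  chosen.push_back(uniform(rng));

  // minDist[i] holds the squared distance from point i to its nearest
  // already-chosen centroid.
  std::vector<double> minDist(data.size(),
                              std::numeric_limits<double>::max());

  while (chosen.size() < k)
  {
    // Only the most recently added centroid can shrink a distance.
    const Point& last = data[chosen.back()];
    for (std::size_t i = 0; i < data.size(); ++i)
    {
      const double d = SquaredDistance(data[i], last);
      if (d < minDist[i])
        minDist[i] = d;
    }

    // Draw the next centroid with probability proportional to minDist[i].
    // Points already chosen have weight zero.
    std::discrete_distribution<std::size_t> weighted(minDist.begin(),
                                                     minDist.end());
    chosen.push_back(weighted(rng));
  }

  return chosen;
}
```

The returned indices would then be handed to the clustering loop as the
initial centroids; nothing in the seeding depends on which loop that is.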

In any case, I would be happy to review and eventually merge an
implementation of k-means++, but I don't think that focusing on the
k-means algorithm alone would make a good GSoC project, because I am not
sure that there is enough work to do.  However, if you can identify many
places where you think you can improve the runtime of the algorithm, and
you can show that this work would take roughly 10-15 full-time weeks,
then I think it would be possible to put together a proposal.

Thanks,

Ryan

-- 
Ryan Curtin    | "Gentlemen, you can't fight in here!  This is the
ryan at ratml.org | War Room!" - President Muffley

