[mlpack] [GSoC '16] Looking to Contribute

Ryan Curtin ryan at ratml.org
Mon Mar 14 09:50:01 EDT 2016


On Sun, Mar 13, 2016 at 09:08:49AM +0530, Rebeka Mukherjee wrote:
> Hello,
> 
> My name is Rebeka Mukherjee. I am a CS undergraduate at Netaji Subhash
> Engineering College, India. I am very eager to work with mlpack, possibly
> through GSoC.
> 
> I am interested in working on Decision Trees. I have been studying and working
> on classification and clustering algorithms for a while now. Currently I am
> on a project that is about building Hadoop MapReduce tools for the K-Means
> algorithm with Expectation Maximization, Fuzzy C-means and Genetic
> Algorithm.
> 
> To get started, I have already built the mlpack library and compiled the
> tutorial programs, so I have a fair idea of how it works now. I have also
> been through the decision stump code that was added to mlpack.
> Instead of building on it to implement a full-blown tree, I would prefer to
> write a new class for one of the efficient decision tree algorithms.
> 
> I have been through some of the literature available online on the
> implementation and modification of ID3 and Random Forests. I have also gone
> through the paper on Density Estimation Trees on the mlpack website. Please
> let me know if there is any literature you would want me to refer to.

Hello Rebeka,

Thanks for getting in touch.  You might consider reading papers on C4.5
or C5 or the original CART paper or other decision tree papers.  When
you prepare your proposal, it is important to consider the API that you
will design, and the tests you will use to validate the correctness of
your algorithm.  So be sure to put some thought and time into those.
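
As a rough illustration of thinking about the API and the tests together,
here is a minimal sketch.  ToyStump, Train(), Classify(), and CountErrors()
are hypothetical names made up for this example; they are not part of
mlpack, and the class only handles one numeric feature with 0/1 labels.  A
real proposal would need a much richer design, so treat this only as a
starting point.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical illustration only: this is NOT an mlpack class.
// A single-split classifier over one numeric feature with 0/1 labels.
class ToyStump
{
 public:
  // Pick the threshold and label orientation with the fewest training errors.
  void Train(const std::vector<double>& values,
             const std::vector<int>& labels)
  {
    assert(values.size() == labels.size());

    std::size_t bestErrors = std::numeric_limits<std::size_t>::max();
    // Try every observed value as a candidate threshold.
    for (const double candidate : values)
    {
      // Try both orientations: label 0 or label 1 below the threshold.
      for (int left = 0; left <= 1; ++left)
      {
        std::size_t errors = 0;
        for (std::size_t i = 0; i < values.size(); ++i)
        {
          const int predicted = (values[i] < candidate) ? left : 1 - left;
          if (predicted != labels[i])
            ++errors;
        }

        if (errors < bestErrors)
        {
          bestErrors = errors;
          threshold = candidate;
          leftLabel = left;
        }
      }
    }
  }

  // Points strictly below the threshold get leftLabel, others the opposite.
  int Classify(const double value) const
  {
    return (value < threshold) ? leftLabel : 1 - leftLabel;
  }

 private:
  double threshold = 0.0;
  int leftLabel = 0;
};

// Test helper: number of points the trained model misclassifies.
std::size_t CountErrors(const ToyStump& stump,
                        const std::vector<double>& values,
                        const std::vector<int>& labels)
{
  std::size_t errors = 0;
  for (std::size_t i = 0; i < values.size(); ++i)
    if (stump.Classify(values[i]) != labels[i])
      ++errors;
  return errors;
}
```

One simple correctness test for a sketch like this is to train on a small
linearly separable dataset and check that CountErrors() returns zero on
the training points.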

The design guidelines might be helpful to look over:

https://github.com/mlpack/mlpack/wiki/DesignGuidelines

Thanks,

Ryan

-- 
Ryan Curtin    | "It's too bad she won't live!  But then again, who
ryan at ratml.org | does?" - Gaff


