[mlpack] looking for guidance and opinions

Ryan Curtin ryan at ratml.org
Mon Nov 28 10:26:19 EST 2016


On Sat, Nov 26, 2016 at 04:41:43PM +0530, Anuraj Kanodia wrote:
> Hello,
> I am Anuraj, a Computer Science (B.E.) and Mathematics (M.Sc.) student.
> I have been going through the mlpack codebase for quite some time now
> (since this September, to be precise). I am already familiar with the mlpack
> workflow, as I have made a few contributions to mlpack.
> 
> /* a little background */
> I am approaching the final stages of my graduation (currently in my pre-final
> year), so I have started looking for topics for my thesis. After exploring
> a lot of options, I finally see my search concluding with ML.
> I went through the ideas page on the mlpack wiki and also went through some
> research papers to get a brief overview of the different aspects of ML.
> In particular, I went through some chapters of Ryan Curtin's dissertation
> (Improving Dual Tree Algorithms). I will be using it as part of my
> study-oriented project in the next semester. This project will help me
> further refine my thesis topic.
> In the meantime, I also implemented a few ML classifiers such as Decision
> Trees (using the ID3 algorithm), Neural Networks (backpropagation algorithm)
> for face/pose/sunglass recognition, and the Naive Bayes Algorithm (for face
> recognition). The code for the above can be found on my GitHub profile (
> [1] ). Do note that the focus was on implementation rather than on writing
> highly optimized code.
> Out of these three, ANNs really caught my attention. The power of Neural
> Networks amazes me and I see this as another potential topic for my thesis
> (especially after coming across Google's 'Quick, Draw!' ( [2] )).
> 
> /* the point */
> So, right now I am interested in these two topics:
> 
>    1. Dual Tree Algorithms.
>    2. Neural Networks.
> 
> *I would be grateful to you if you could recommend some relevant sources
> which would further shed some light on these topics. I would also like to
> hear your opinion on these topics. *
> 
> I already plan on completing this course on neural networks over the
> winter: https://www.coursera.org/learn/neural-networks
> 
> Also, the 'Essential Deep Learning Modules' project from the ideas page is
> relevant here (from what I understand, it wasn't taken up by anyone). I
> think this project will give me a chance to learn about different
> fundamental networks. I have yet to take a look at the relevant tickets and
> references mentioned there, though. *Are there any other sources I should
> refer to in order to prepare for the project? *
> 
> I look forward to hearing from you.

Hi Anuraj,

As for dual-tree algorithms, probably the best references are the thesis
I wrote, which you already said you read, and the papers that it cites.
A lot of the algorithms from those papers are actually implemented as
part of mlpack (like nearest neighbor search, FastMKS, minimum spanning
tree calculation, and so forth), so you can refer to the code for
implementation and the paper for theory.

Here's a long writeup by Marcus from last year on the Essential Deep
Learning Modules project:

http://lists.mlpack.org/pipermail/mlpack/2016-March/000842.html

I hope this is helpful; let me know if I can clarify anything.

I realize also that I owe you some comments on an open PR; I'll take
care of that next...

Thanks,

Ryan

-- 
Ryan Curtin    | "Gentlemen, you can't fight in here!  This is the
ryan at ratml.org | War Room!" - President Muffley

