[mlpack] GSOC proposal for Profiling for parallelism.

Ryan Curtin ryan at ratml.org
Mon Mar 16 13:34:56 EDT 2020


On Sun, Mar 15, 2020 at 06:20:31PM +0530, Param Mirani wrote:
> Hello Marcus/Ryan,
> 
> I am Param Mirani and I want to contribute to mlpack by participating
> in GSoC 2020 with the Profiling for Parallelism project. I have made
> some pull requests that try to improve the performance of the
> algorithms on multi-core processors using OpenMP. I had an idea and
> wanted to know your view on it.

Hi Param,

Thanks for getting in touch and sorry that I still have not had time to
look into these PRs in detail.  I have some comments on these
algorithms:

> I wanted to work on following algorithms in the 12 week program based on my
> knowledge of machine learning algorithms.
> 
> 1)KNN

Dual-tree (and single-tree) algorithms can actually be parallelized
pretty well via the use of OpenMP tasks.  There was some trickiness in
tuning it, and I never tried it on multiple different systems.  I did
this experiment back in 2015, but for some reason I never actually
opened a PR for it or merged it in.  It did not give perfect speedup,
but it came fairly close with ~4-8 cores.  So the strategy here may not
be too hard.

> 2)Decision Trees

I bet tasks would work well for this also, probably more so than
low-level parallelization of building each individual node.  Note that
random forests should already be parallelized at the highest level.

> 3)Perceptron

I wonder if the best thing to do here is refactor the Perceptron code to
use ensmallen optimizers; some of those are parallel.  Implementing
parallelized optimizers may also be useful.

> 4)Back-propagation

I'm not sure how useful OpenMP will be here.  Most of the heavy lifting
in training neural networks is already in operations like matrix
multiplication, and since most people will be using OpenBLAS, those
operations will already be parallelized.

> I'd like to know if the timeline I've proposed is realistic, if all
> assumptions are correct, and if there are any more algorithms that you
> would like me to work on while working on this project.

Yes, I think that what you've proposed is reasonably realistic.  There
will certainly be a lot of trial and error and experimentation during
the process of figuring out the best way to parallelize something.  Some
techniques may work well on some systems and not well on others, making
the problem more difficult.

I hope the comments I've added here help.  It might also be reasonable
to consider different algorithms in some cases.

Thanks!

Ryan

-- 
Ryan Curtin    | "No... not without incident."
ryan at ratml.org |   - John Preston
