[mlpack] Profiling for parallelization

Ryan Curtin ryan at ratml.org
Wed Mar 21 14:53:28 EDT 2018


On Wed, Mar 21, 2018 at 12:20:05AM +0530, Nikhil Goel wrote:
> Hi Ryan
> 
> Thank you for your help. I've submitted the draft of my proposal and it
> would be really helpful if you could review it and tell me the changes I
> should make.
> My main concerns regarding my proposal are -
> 1) The number of algorithms/functions I've chosen. I'm trying to research
> more, but if you can tell me your thoughts on the number of algorithms I've
> chosen, it would be really helpful.
> 2) I looked into logistic regression, and it is using SGD and L-BFGS.
> Parallel SGD has been implemented in mlpack but I'm unsure if that will
> actually provide a significant speedup as the parallelization is already
> there at low levels. Do you think it will be worth investing my time in?
> Should I mention it in my GSoC proposal?
> 3) Similar kind of problem for naive bayes. I've figured out the for loops
> that should be parallelized but the papers I followed showed no significant
> performance improvement in parallel naive bayes. Should I mention this in
> my proposal?
> 4) How much change is permitted before I should make another file for
> parallel implementation of the algorithm?
> 5) I've dropped the idea of providing an API since you're right, it will be
> better for the user to learn OpenMP as it's pretty well known.
> 6) I've added bagging to my proposal. So I'll implement and parallelize it.
> I hope that's fine.

Hi Nikhil,

Thanks for the update.  I don't know how quickly or slowly you work, so
I can't provide much input on how many algorithms you should do---this
part is up to you.

Personally I think that since parallel SGD is already implemented, it's
not necessary to focus on it.  It's hard to say how much change we
should make before creating another file.  I would say, if we can apply
OpenMP directly to an algorithm in a way that does not fundamentally
change the algorithm, there is no reason to have a second parallel
implementation.
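
To make that concrete, here is a minimal sketch of what I mean by applying
OpenMP directly.  The function SquaredRowNorms() is hypothetical and is not
part of mlpack; it only stands in for any loop whose iterations are
independent of each other.  The serial algorithm is unchanged, and the only
addition is the pragma.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an mlpack function): compute the squared
// Euclidean norm of each row of a matrix stored as a vector of rows.
std::vector<double> SquaredRowNorms(
    const std::vector<std::vector<double>>& rows)
{
  std::vector<double> norms(rows.size(), 0.0);

  // Each iteration reads rows[i] and writes only norms[i], so the
  // iterations are independent and can be split across threads.  If the
  // code is compiled without OpenMP support, the pragma is ignored and the
  // loop simply runs serially.  A signed index is used so the loop is also
  // accepted by older OpenMP implementations.
  #pragma omp parallel for
  for (long i = 0; i < static_cast<long>(rows.size()); ++i)
  {
    double sum = 0.0;
    for (const double v : rows[i])
      sum += v * v;
    norms[i] = sum;
  }

  return norms;
}
```

With gcc or clang this would be built with -fopenmp to enable the threads;
without that flag the same source gives the serial version, which is why a
second file is not needed in a case like this.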

Thanks,

Ryan

-- 
Ryan Curtin    | "Lots of respectable people have been hit by
ryan at ratml.org | trains."  - Penny

