[mlpack] Parallel Stochastic Optimization Methods

Ryan Curtin ryan at ratml.org
Fri Mar 11 08:29:51 EST 2016


On Fri, Mar 11, 2016 at 06:40:57PM +0530, Lokesh Jain wrote:
> Hi
> 
> I am Lokesh Jain, a 4th-year undergraduate B.E. (Hons.) Computer Science
> and M.Sc. (Hons.) Mathematics student at BITS Pilani, India. I am
> interested in the project Parallel Stochastic Optimization Methods. I
> have worked on projects using OpenMP, Pthreads, MPI, and CUDA. I had a
> doubt about the kind of parallelism expected in the algorithm, since in
> Stochastic Gradient Descent there is a dependency between weight
> updates: the weights updated for one training example depend on the
> updated weights from the previous training example. So how do we
> parallelize the algorithm in terms of the iterations over the training
> set? Or do we parallelize in terms of step size, with different threads
> running the algorithm with different step sizes?

Hi Lokesh,

You are right that parallelizing SGD directly is problematic because of
the weight dependencies.  But there are parallel SGD variants that work
around this.  For an example, check out the Hogwild! paper:

http://papers.nips.cc/paper/4390-hogwild-a-lock-free-approach-to-parallelizing-stochastic-gradient-descent.pdf
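
Roughly, the idea in Hogwild! is that each thread samples training points
at random and applies its SGD updates to the shared weight vector with no
locking at all; when the gradients are sparse, the occasional overwritten
update does not hurt convergence.  Here is a minimal sketch of the idea
using OpenMP (illustrative only -- the function name, signature, and the
squared-error objective are made up for the example, not mlpack's
optimizer API):

#include <armadillo>
#include <omp.h>
#include <random>

// Hogwild!-style lock-free parallel SGD sketch: each thread draws random
// training points and updates the shared weights without synchronization.
void HogwildSGD(const arma::mat& data,       // one training point per column
                const arma::rowvec& responses,
                arma::vec& weights,          // shared, updated lock-free
                const double stepSize,
                const size_t maxIterations)
{
  #pragma omp parallel
  {
    // Each thread gets its own RNG so sample choices differ across threads.
    std::mt19937 rng(omp_get_thread_num() + 1);
    std::uniform_int_distribution<size_t> pick(0, data.n_cols - 1);

    #pragma omp for
    for (size_t i = 0; i < maxIterations; ++i)
    {
      const size_t idx = pick(rng);

      // Gradient of squared error for a linear model, as a stand-in for
      // whatever differentiable objective is being optimized.
      const double residual =
          arma::dot(weights, data.col(idx)) - responses(idx);
      const arma::vec gradient = residual * data.col(idx);

      // Unsynchronized write to the shared weights (the "lock-free" part);
      // races between threads are tolerated rather than prevented.
      weights -= stepSize * gradient;
    }
  }
}

The key point is that nothing protects `weights' with a mutex or atomic
update; the Hogwild! paper analyzes when this still converges.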

I hope this is helpful.

Thanks,

Ryan

-- 
Ryan Curtin    | "He takes such wonderful pictures with
ryan at ratml.org | his paws."  - Radio man
