[mlpack] Implementation of HOGWILD! in mlpack

Sun Jun 25 06:00:45 EDT 2017

On Sat, Jun 24, 2017 at 05:30:57PM +0530, Shikhar Bhardwaj wrote:
> Thanks a lot for the insights, Ryan. I noticed I sent this as a reply
> directly to you, instead of the mailing list. Sorry for the extra mail.
> 
> The update in the description of the algorithm requires only those
> components of the decision variable to be updated which are related
> (affected by a part of the loss function). Updating the entire decision
> variable (in the vectorized fashion, as is done currently in sgd update
> policies) will not be lock-free, which is required by the algorithm.
> 
> I thought of two approaches to this :
> 
> 1. Update each component of the decision variable from the sparse gradient,
> not by a vectorized implementation, but by iterating over the entire
> gradient. The update will be done only for non-zero parts of the gradient.
> This will be atomic(a float subtraction).
> 2. Change the requirements of the function interface, to provide a
> *Components* function, which takes in a datapoint and returns a list of
> components for which the decision variable needs to be updated. The
> Optimizer only updates the components returned by the function.
> 
> I am trying to think if there are cases where either of these would be
> better. (The first one is clearly better from an API perspective because
> then the requirements on the function interface remain unchanged). Will
> there be cases where the 2nd approach will do less work?

Ah, I see, you are right that there is more to this than I originally
thought.

Your first idea is definitely the easiest to implement but not
particularly efficient.  The second idea would definitely be better, but
I would suggest that Hogwild use a Components() function, if available,
but otherwise use the first idea (despite its inefficiency).  You could
do this with a little bit of SFINAE work and probably some help from
src/mlpack/core/util/sfinae_utility.hpp to detect if the function type
has a Components() function.  Maybe that is helpful, maybe not, I am
just tossing ideas out there. :)

Another idea, which I don't think is feasible, is to adapt all
FunctionType classes to be able to take either an arma::mat or an
arma::sp_mat as the graident parameter.  Then, if Hogwild passed a
sparse matrix, you could easily just iterate over the nonzero elements
when you received that sparse gradient back.  However, I am not sure if
the sparse matrix support in Armadillo plus the refactoring effort
necessary here would actually make that worth it.

-- 
Ryan Curtin    | "Indeed!"
ryan at ratml.org |   - David Lo Pan