[mlpack] [GSOC '16] : Student interested in Low rank/sparse optimization using Frank-Wolfe

Kwon, Jin Kyoung jkwon47 at gatech.edu
Fri Mar 11 12:20:30 EST 2016


Dear Ryan Curtin,

(This may be a duplicate - I sent the same email from my gmail account, which I suspect got filtered by the mailing list, so I am resending it from the address registered with the mailing list.)

Thank you so much for your quick and thorough reply! I had some follow-up questions, and would really appreciate if you could spend a moment to help me understand the library better.

I saw that there is documentation for the optimizers at http://mlpack.org/doxygen/, albeit for V1.0.12. Would documenting the optimizer architecture for V2.0.1 be of interest to you as something I could do as part of the project? In any case, looking through the Doxygen pages still gave me a good idea of the architecture. I understood that wrapping optimizers in *_function.hpp (e.g. aug_lagrangian_function wrapping aug_lagrangian) enables them to be used by other optimizers. What I noticed was the absence of test functions for some optimizers (e.g. sa, sgd). Is there a reason for this? Also, is there a reason the tests under aug_lagrangian are named "aug_lagrangian_test_functions" rather than just "test_functions" (as is done under lbfgs and sgd)?
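
To check my understanding of that wrapping pattern, here is roughly how I picture it (purely an illustrative sketch with made-up names, not the actual mlpack classes):

  #include <armadillo>

  // A wrapper that takes any FunctionType and itself exposes the same
  // Evaluate()/Gradient() interface, so another optimizer can consume it.
  template<typename FunctionType>
  class WrappedFunction
  {
   public:
    WrappedFunction(FunctionType& function) : function(function) { }

    // Forward (or augment, e.g. with penalty terms) the objective.
    double Evaluate(const arma::mat& coordinates)
    { return function.Evaluate(coordinates); }

    void Gradient(const arma::mat& coordinates, arma::mat& gradient)
    { function.Gradient(coordinates, gradient); }

   private:
    FunctionType& function;
  };

Is that roughly the right mental model?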

I would also like to look at how the artificial neural network code was modified to use the optimizers. That would help me better understand how to carry out your suggestion of modifying programs like mlpack_softmax_regression to fit the Frank-Wolfe optimizer. Between which versions did the nn code change? Also, where can I find the nn code in the library? I suspect it is under src/mlpack/methods/sparse_coding, but I also saw that we used to have an ann module, which I do not see in the current version.

I do agree - Charles is a great lecturer! He hasn't mentioned mlpack, though - what brought me here was Google queries for "georgia tech machine learning" and the like :) I do think he should include mlpack as part of "Suggest Software" in our class syllabus! We could suggest the idea, as I suspect there must be C++ users in the class. Right now we have an assignment on implementing optimizers such as randomized hill climbing (RHC), MIMIC, SA, and genetic algorithms (GA). Would implementing RHC, MIMIC, and GA be of future interest to mlpack?

________________________________________
From: Ryan Curtin <ryan at ratml.org>
Sent: Tuesday, March 8, 2016 10:00 AM
To: Kwon, Jin Kyoung
Cc: mlpack at cc.gatech.edu
Subject: Re: [mlpack] [GSOC '16] : Student interested in Low rank/sparse optimization using Frank-Wolfe

On Tue, Mar 08, 2016 at 05:56:02AM +0000, Kwon, Jin Kyoung wrote:
> Hello everyone,
>
> My name is Jin Kyoung Kwon and I am a 3rd year Computer Science major
> at Georgia Institute of Technology, interested in contributing to this
> project through GSOC. I am currently enrolled in Dr. Isbell's Machine
> Learning course at GT and have enjoyed the course so much so that I am
> aspiring to spend my summer learning more about ML and doing related
> work. I am specifically interested in the low rank/sparse optimization
> using Frank-Wolfe, as I am interested in optimization problems as a CS
> major concentrating in modeling/simulation. I am also familiar with
> non-convex optimization problems through learning about randomized
> optimization topics as well as a bit of the math through my coursework
> in differential equations and linear algebra. I am willing to learn
> more advanced math topics as well as gain skills in C++ as I believe
> the project will be a good fit for me.
>
> I have built the mlpack library on my machine, learned how the library
> works through running simple execs, and read Dr. Martin Jaggi's paper
> about the topic. I wanted to gain a better sense of how the
> implementation of the framework would fit into the existing one. I saw
> that we already have methods for core optimizers (core/optimizers) as
> well as problems that variants of Frank-Wolfe algorithms can be
> applied to, such as Lasso (methods/lasso), matrix completion
> (methods/matrix_completion), and classic learning tasks (boosting,
> svm, etc). What I think is that our framework would go under
> core/optimizers, but I am not sure. I wanted to ask, how are you
> envisioning the Frank-Wolfe algorithms to fit in with the current
> architecture?
>
> I am incredibly excited to continue learning about the library, and I
> am looking forward to hearing from you!

Hi Jenna,

Thanks for getting in touch.

I'm glad you're enjoying Charles' class; I think he's a great lecturer.
I hope that he has mentioned mlpack in his class and that is what
brought you here... :)

The current optimizer architecture is unfortunately not documented in
its own wiki page or doxygen page, but I can explain it to you a little
bit here and point you towards relevant code:

Optimizers in mlpack are stored in src/mlpack/core/optimizers/; right
now, we have implementations of SGD, mini-batch SGD, L-BFGS, simulated
annealing, RMSprop, the augmented Lagrangian method, and a low-rank
algorithm for semidefinite programming (LRSDP).  I think there are more
that will be there soon, since the artificial neural network code uses
optimizers that now match the API of the rest of the optimizers.

Each optimizer takes (at least) one template parameter, which is the
function type to be optimized:

template<typename FunctionType>
class Optimizer;

The function to be optimized must provide at least two functions:

  double Evaluate(const arma::mat& coordinates);
  void Gradient(const arma::mat& coordinates, arma::mat& gradient);

and those functions, given some input coordinates, will calculate the
objective function or the gradient.  This should be documented fairly
well in the comments in each file, and the code should be fairly easy to
read to figure out what is going on (I'd suggest taking a look at SGD
first since it's so simple).
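
As a concrete (toy) example, a valid function type could look something
like the following; the quadratic objective is made up for illustration,
and the L-BFGS usage at the bottom assumes the 2.0.x API, so check
lbfgs.hpp for the exact constructor signature:

  #include <mlpack/core.hpp>
  #include <mlpack/core/optimizers/lbfgs/lbfgs.hpp>

  // Toy objective f(x) = ||x - target||^2 satisfying the FunctionType API.
  class QuadraticFunction
  {
   public:
    QuadraticFunction(const arma::mat& target) : target(target) { }

    double Evaluate(const arma::mat& coordinates)
    { return arma::accu(arma::square(coordinates - target)); }

    void Gradient(const arma::mat& coordinates, arma::mat& gradient)
    { gradient = 2.0 * (coordinates - target); }

   private:
    arma::mat target;
  };

  int main()
  {
    arma::mat target(3, 1, arma::fill::randu);
    QuadraticFunction f(target);

    // Any optimizer templated on FunctionType can now optimize f.
    mlpack::optimization::L_BFGS<QuadraticFunction> lbfgs(f);

    arma::mat coordinates(3, 1, arma::fill::zeros);
    lbfgs.Optimize(coordinates);

    coordinates.print("optimized coordinates:");
  }

(Note that SGD expects a slightly extended, "decomposable" version of this
API, with a NumFunctions() method and Evaluate()/Gradient() overloads that
also take the index of an individual data point; see sgd.hpp for details.)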

My hope is that a project implementing the Frank-Wolfe algorithm would
result in an optimizer of this type ending up in
src/mlpack/core/optimizers/ and then possibly the various programs like
mlpack_softmax_regression, mlpack_logistic_regression, and others being
modified so that they could use the Frank-Wolfe optimizer.
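
To give a rough idea of the shape such a class would take, here is a
sketch (not real mlpack code; the step size rule and especially the
linear minimization oracle are placeholders, and designing those properly
would be the real work of the project):

  #include <armadillo>

  // Skeleton of a Frank-Wolfe optimizer matching the mlpack optimizer API.
  template<typename FunctionType>
  class FrankWolfe
  {
   public:
    FrankWolfe(FunctionType& function, const size_t maxIterations = 1000) :
        function(function), maxIterations(maxIterations) { }

    // Optimize the function; 'coordinates' holds the start point and result.
    double Optimize(arma::mat& coordinates)
    {
      arma::mat gradient, s;
      for (size_t k = 0; k < maxIterations; ++k)
      {
        function.Gradient(coordinates, gradient);

        // Linear minimization oracle: s = argmin_{x in D} <gradient, x>.
        // The line below is the oracle for the unit l-infinity box; other
        // constraint sets (l1 ball, trace-norm ball, ...) need their own.
        s = -arma::sign(gradient);

        const double gamma = 2.0 / (k + 2.0);  // classic step size
        coordinates = (1.0 - gamma) * coordinates + gamma * s;
      }

      return function.Evaluate(coordinates);
    }

   private:
    FunctionType& function;
    size_t maxIterations;
  };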

LARS actually does not support other optimizers; it has a hard-coded
version of (I think) Newton's method in there somewhere.  It would be
great to refactor it, pull Newton's method out, and allow an arbitrary
optimizer, but I don't think anyone's ever had the time (or motivation)
to do that.

I remember that I just wrote more about this in another email but I
can't seem to find the link.  Maybe if you search the archives you will
be able to find something more?
https://mailman.cc.gatech.edu/pipermail/mlpack/

I hope this information is helpful; let me know if I can clarify
anything.

Thanks,

Ryan

--
Ryan Curtin    | "Avoid the planet Earth at all costs."
ryan at ratml.org |   - The President

