[mlpack] GSoC 2014 simulated annealing optimizer

Mon Apr 7 11:59:13 EDT 2014

On Fri, Apr 04, 2014 at 08:25:19PM -0500, Zhihao Lou wrote:
> Hi Ryan
> 
> On Fri, Apr 4, 2014 at 4:56 PM, Ryan Curtin <gth671b at mail.gatech.edu> wrote:
> 
> >
> >  * Could we templatize the probability distribution from which the next
> >    sample is chosen, and also the cooling schedule?  This way we can
> >    have an interface that looks like this:
> >
> >    template<typename FunctionType, typename DistributionType, typename
> >        CoolingScheduleType>
> >    class SA;
> >
> >    This will allow more flexibility to the user in the exact way they
> >    want to use simulated annealing.  You could take what you already
> >    have, which is a uniform distribution and a geometric cooling
> >    schedule, and split them out into a UniformDistribution and
> >    GeometricCoolingSchedule class.
> >
> 
> I'm certainly willing to do that. The problem is, however, how to
> design the interface between these separate classes and the main loop
> of the annealing.
> 
> For example, the geometric cooling schedule only needs current
> temperature to calculate next step's temperature. The same is true for
> linear and logarithmic schedule, though I don't think anybody should
> use these two. But the other major class of cooling schedules is
> adaptive schedules, which usually require additional information like
> the current value of the cost function (usually for calculating
> variance etc.), and Lam's schedule (see my comments in MoveControl)
> requires a boolean for whether the last proposed move was accepted. So
> it is very hard to anticipate what information the cooling schedule
> will need. The solution I can think of is to pass the SA itself to the
> schedule, but this is ugly.

You are right; passing the SA object itself is not a clean solution.
When designing abstractions like this it is often difficult to predict
what parameters the templated class will need, as you have pointed
out.  In this case, I think the best idea is to produce an abstraction
that takes the current temperature and current objective function value
(it is probably possible to determine the necessary information for
Lam's schedule by tracking what the objective function value was the
last time the schedule was called).  If it needs to be modified later
for some type of schedule we did not anticipate, we can do that then.
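
As a rough sketch of what such an abstraction could look like: the class
names, the NextTemperature() method, and the fixed 0.95 ratio below are
placeholders for illustration only, not existing mlpack code.  The second
class only shows how acceptance might be inferred from successive
objective values; it is not Lam's schedule.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical geometric schedule: T_next = ratio * T.  It accepts the
// objective value only to satisfy the common interface, and ignores it.
class GeometricCoolingSchedule
{
 public:
  explicit GeometricCoolingSchedule(const double coolingRatio = 0.95) :
      ratio(coolingRatio)
  {
    assert(ratio > 0.0 && ratio < 1.0);
  }

  double NextTemperature(const double temperature,
                         const double /* objective */)
  {
    return ratio * temperature;
  }

 private:
  double ratio;
};

// Hypothetical schedule that remembers the objective value from the
// previous call.  If the value changed, it assumes the last proposed move
// was accepted.
class AcceptanceTrackingSchedule
{
 public:
  double NextTemperature(const double temperature, const double objective)
  {
    if (hasLast)
    {
      ++moves;
      if (objective != lastObjective)
        ++accepted;
    }
    lastObjective = objective;
    hasLast = true;

    // Placeholder cooling rule; a real adaptive schedule would use the
    // statistics gathered above.
    return 0.95 * temperature;
  }

  // Fraction of tracked moves that appear to have been accepted.
  double AcceptanceRatio() const
  {
    return (moves == 0) ? 0.0 : double(accepted) / double(moves);
  }

 private:
  bool hasLast = false;
  double lastObjective = 0.0;
  std::size_t moves = 0;
  std::size_t accepted = 0;
};
```

This assumes the schedule is called exactly once per proposed move.  An
accepted move that leaves the objective value unchanged would be counted
as a rejection, so the inference is only approximate.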

> The other thing is that the amount of change per move in
> generateMove() actually follows a double exponential (Laplace)
> distribution, calculated from the uniform intermediate unif. (I
> probably need more comments there.) The double exponential
> distribution is related to the move control and the 0.44 value.  I'd
> suggest not changing this.

That's fine, but if we can templatize that into the DistributionType
parameter, that would be great.
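
For illustration, a DistributionType for the Laplace case might look
roughly like the sketch below.  The class name, Transform(), and Sample()
are assumed names, not the code in the submitted patch; the transform is
the standard inverse-CDF construction of a zero-mean Laplace variate from
a uniform draw on (-0.5, 0.5).

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical DistributionType: zero-mean Laplace (double exponential)
// with the given scale.
class LaplaceDistribution
{
 public:
  explicit LaplaceDistribution(const double laplaceScale = 1.0) :
      scale(laplaceScale)
  {
    assert(scale > 0.0);
  }

  // Inverse-CDF transform: maps u in (-0.5, 0.5) to a Laplace variate,
  // x = -scale * sgn(u) * ln(1 - 2|u|).
  double Transform(const double u) const
  {
    const double sign = (u < 0.0) ? -1.0 : 1.0;
    return -scale * sign * std::log(1.0 - 2.0 * std::abs(u));
  }

  // Draw a uniform intermediate value and transform it.
  template<typename RNGType>
  double Sample(RNGType& rng) const
  {
    std::uniform_real_distribution<double> unif(-0.5, 0.5);
    double u;
    do
    {
      u = unif(rng);
    } while (std::abs(u) >= 0.5); // Reject the endpoint to avoid log(0).

    return Transform(u);
  }

 private:
  double scale;
};
```

With something like this, the SA class would only call Sample() and would
not need to know which distribution it is drawing from.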

> >  * Can you comment what is going on in GenerateMove() and MoveControl()
> >    a little better?  I can sort of follow what you are doing, but it
> >    takes a while to figure it out, and a couple informative comments could
> >    make it much easier to read.
> >
> 
> Sure. I'm going to work on these right now.

Thanks!

> >  * I'd like to add your name to the list of contributors once I work
> >    this in.  Do you mind if I do this?
> 
> That will be great!

Ok; I will do that when we finish the design of the optimizer and commit
it.

Thanks,

Ryan

-- 
Ryan Curtin    | "None of your mailman friends can hear you."
ryan at ratml.org |   - Alpha