[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Ryan Curtin ryan at ratml.org
Wed Apr 26 11:17:27 EDT 2017


On Wed, Apr 26, 2017 at 11:24:18AM +0500, Kirill Mishchenko wrote:
> Hi Ryan.
> 
> > The key problem, like you said, is that we don't know what AuxType
> > should be, so we can't call its constructor.  But maybe we can adapt
> > things a little bit:
> > 
> > template<typename AuxType, typename... Args>
> > struct Holder /* needs a better name */
> > {
> >  // This typedef allows us access to the type we need to construct.
> >  typedef AuxType Aux;
> > 
> >  // These are the parameters we will use.
> >  std::tuple<Args...> args;
> > 
> >  Holder(Args... argsIn) { /* put argsIn into args */ }
> > };
> > 
> > Then we could use this in combination with the Bind() class when calling an
> > optimizer:
> > 
> >  std::array<double, 3> param3s = { 1.0, 2.0, 4.0 };
> >  std::array<double, 2> auxParam1s = { 1.0, 3.0 };
> >  std::array<double, 4> auxParam2s = { 4.0, 5.0, 6.0, 8.0 };
> >  auto results = tuner.Optimize<GridSearch>(Bind(param1), Bind(param2),
> >      param3s, Holder<AuxType>(auxParam1s, auxParam2s));
> > 
> > Like most of my other code ideas, this is a very basic sketch, but I
> > think it can work.  Let me know what you think or if there is some
> > detail I did not think about enough that will make the idea fail. :)
> 
> I think this approach is quite implementable. Moreover, we should be
> able to support Bind for aux parameters as well:
>  
>   std::array<double, 3> param3s = { 1.0, 2.0, 4.0 };
>   double auxParam1 = 1.0;
>   std::array<double, 4> auxParam2s = { 4.0, 5.0, 6.0, 8.0 };
>   auto results = tuner.Optimize<GridSearch>(Bind(param1), Bind(param2),
>      param3s, Holder<AuxType>(Bind(auxParam1), auxParam2s));

Yeah, that seems like it will work.  It might be worth spending some
time thinking about what would be the easiest for the user to
understand, but in either case the general implementation will be the
same.
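
For concreteness, here is one way the Holder sketch could be fleshed
out.  This is just a rough cut; the MakeHolder() helper is my addition,
since 'Args...' can't be deduced from a constructor call like
'Holder<AuxType>(...)' when only AuxType is given explicitly:

  #include <tuple>
  #include <utility>

  template<typename AuxType, typename... Args>
  struct Holder /* still needs a better name */
  {
    // This typedef allows us access to the type we need to construct.
    typedef AuxType Aux;

    // These are the parameters we will use; each may be a set of values
    // to search over or a Bind()-ed fixed value.
    std::tuple<Args...> args;

    Holder(Args... argsIn) : args(std::move(argsIn)...) { }
  };

  // Helper so the call site can deduce Args... automatically, e.g.
  //   MakeHolder<AuxType>(Bind(auxParam1), auxParam2s)
  template<typename AuxType, typename... Args>
  Holder<AuxType, Args...> MakeHolder(Args... args)
  {
    return Holder<AuxType, Args...>(std::move(args)...);
  }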

> > Sure; I think maybe we should allow the user to pass in a DatasetInfo
> > with the training data and labels, to keep things simple.
> 
> Can you clarify a bit more what you mean here?

Yeah, my impression is that the user creates the hyperparameter
optimizer like this:

  HyperParameterOptimizer<...> h(data, labels);

My suggestion is to add another overload:

  HyperParameterOptimizer<...> h(data, datasetInfo, labels);

This is because I consider the dataset information, which encodes the
types of dimensions, to be a part of the dataset.  Not all machine
learning methods support a DatasetInfo object; I believe that it is only
DecisionTree and HoeffdingTree at the moment (maybe there is one more I
forgot).
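
In code, the two overloads might look roughly like this (hypothetical
template parameters; data::DatasetInfo is the existing mlpack class
that records whether each dimension is numeric or categorical):

  #include <mlpack/core.hpp>

  using namespace mlpack;

  template<typename MLAlgorithm,
           typename Metric,
           typename CV,
           typename OptimizerType>
  class HyperParameterOptimizer
  {
   public:
    // Numeric-only data.
    HyperParameterOptimizer(const arma::mat& data,
                            const arma::Row<size_t>& labels);

    // Data with mixed categorical/numeric dimensions, for methods like
    // DecisionTree and HoeffdingTree that accept a DatasetInfo.
    HyperParameterOptimizer(const arma::mat& data,
                            const data::DatasetInfo& info,
                            const arma::Row<size_t>& labels);
  };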

> > // move optimizer type to class template parameter
> > HyperParameterOptimizer<SoftmaxRegression<>, Accuracy, KFoldCV, SA> h;
> > 
> > h.Optimizer().Tolerance() = 1e-5;
> > h.Optimizer().MoveCtrlSweep() = 3;
> > 
> > h.Optimize(…);
> 
> In this approach we need to construct an optimizer before the method
> Optimize (of HyperParamOptimizer(Tuner) in the example above) is
> called, and that can be very problematic for two reasons.
>
> 1. We don’t know what FunctionType object (which wraps cross
> validation) to optimize since it depends on what we pass to the method
> Optimize (in particular, it depends on whether or not we bind some
> arguments).
>
> 2. In the case of GridSearch we also don’t know the sets of values for
> parameters before calling the method Optimize. Recall that we pass
> these sets of values during construction of a GridSearch object.

Right, I see what you mean.  At the current time the mlpack optimizers
expect a 'FunctionType&' to be passed to the optimizer, and this
reference is held internally.  However, that design decision was made
before C++11 and was intended to avoid copies.  But now, we have C++11
and rvalue references, so we can do a redesign here to work around at
least the first issue: we can have the optimizers hold 'FunctionType',
and allow the user to pass in a 'FunctionType&&' and then use the move
constructor.

In that way, you could create an optimizer without having access to the
instantiated FunctionType.
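
As a rough illustration of that change (with made-up names, not the
actual signature of SA or any other mlpack optimizer):

  #include <utility>

  template<typename FunctionType>
  class SomeOptimizer
  {
   public:
    // Take ownership of the function via move: the optimizer can now be
    // constructed from a temporary, and the move avoids the copy that
    // the old 'FunctionType&' design was trying to prevent.
    SomeOptimizer(FunctionType&& function) :
        function(std::move(function)) { }

    // Parameters stay modifiable after construction, as in the
    // h.Optimizer().Tolerance() example above.
    double& Tolerance() { return tolerance; }

   private:
    FunctionType function;  // Held by value instead of by reference.
    double tolerance = 1e-5;  // Arbitrary default for the sketch.
  };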

I can see a few ways to solve the second issue after that change is
done.  But in either case, the goal from my end would be to avoid a big,
long call to Optimize() that has Bind(), Holder<>(), and
OptimizerArg() types all in it.  I think the idea of passing optimizer
arguments after the arguments to the machine learning algorithm and
marking them all with OptimizerArg() might be confusing for users, and
it's easier if they can directly modify the parameters of the optimizer.

> > If that's correct, then it might be nice to implement some additional
> > idea such as when the user passes a 'math::Range<double> lambda', the
> > search will be over all possible values of lambda within the given
> > range.  (One can simply modify the objective value to be DBL_MAX when
> > outside the bounds of the given lambda, or we can consider visiting how
> > optimizers can work in a constrained context.)
> 
> I think this behaviour should be handled by optimizers, since we are
> supposed to call them only once. I guess we have already touched on
> this feature in the discussion about simulated annealing.

I agree; at the current time we don't have any support for constrained
optimizers though.  Whatever you end up implementing for GridSearch
might be a good start, since technically grid search is a special case
of constrained optimization.
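
As a starting point for that, the range idea from earlier in the thread
could be expressed as a wrapper like the one below (a hedged sketch:
'CVType' and its Evaluate() signature are placeholders, but math::Range
with Lo()/Hi() already exists in mlpack):

  #include <mlpack/core.hpp>
  #include <cfloat>
  #include <vector>

  using namespace mlpack;

  template<typename CVType>
  class RangeBoundFunction
  {
   public:
    RangeBoundFunction(CVType& cv,
                       const std::vector<math::Range>& ranges) :
        cv(cv), ranges(ranges) { }

    double Evaluate(const arma::vec& params)
    {
      // Any out-of-range parameter gets an objective of DBL_MAX, which
      // steers an unconstrained optimizer back inside the bounds.
      for (size_t i = 0; i < params.n_elem; ++i)
        if (params[i] < ranges[i].Lo() || params[i] > ranges[i].Hi())
          return DBL_MAX;

      return cv.Evaluate(params);
    }

   private:
    CVType& cv;
    const std::vector<math::Range>& ranges;
  };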

> In the light of what we have discussed recently, I think it is worth
> revisiting what can be implemented as a GSoC project, and when. <...>

I agree with the changes that you have proposed.

Thanks again for the discussion, I think the ideas here are getting
really mature.  I think that there is some cool functionality that will
be possible with these modules that isn't possible in any other machine
learning library.  For instance, even just hyperparameter search over
continuous variables isn't very well supported by other toolkits, and
would be a really nice thing to showcase for mlpack.

Ryan

-- 
Ryan Curtin    | "You can think about it... but don't do it."
ryan at ratml.org |   - Sheriff Justice

