[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Ryan Curtin ryan at ratml.org
Fri Apr 14 16:29:49 EDT 2017


On Mon, Apr 10, 2017 at 11:13:50AM +0500, Kirill Mishchenko wrote:
> Hi Ryan,
> 
> I think I’m starting to see your perspective on how the grid search
> optimiser should be implemented. But some concerns remain.

Hi Kirill,

Sorry for the slow response.

> 1. Some information (precision) can be lost during conversions between
> integer and floating-point values (e.g., when encoding a size_t value
> into a cell of an arma::mat). It is not very likely to happen in
> practice (it would require very large integer values), but it should
> be mentioned anyway.

Agreed.  I think with an IEEE 754 double precision floating point number
we can represent integers exactly up to 2^53; beyond that, precision is
lost.
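
As a quick standalone illustration of where that boundary is (plain
C++, nothing mlpack-specific):

    #include <cstdint>
    #include <iostream>

    int main()
    {
      // 2^53 is the largest power of two up to which every non-negative
      // integer is exactly representable in an IEEE 754 double.
      const std::uint64_t limit = (std::uint64_t(1) << 53);

      const double a = double(limit);      // exactly 2^53
      const double b = double(limit + 1);  // rounds back down to 2^53

      std::cout << (a == b) << std::endl;  // prints 1: the values collide
      return 0;
    }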

> 2. There are some other types of arguments in constructors for machine
> learning algorithms (models) besides numeric types and
> data::DatasetInfo. These include a template WeakLearnerType in
> AdaBoost, templates CategoricalSplitType and NumericSplitType in
> HoeffdingTree, std::unordered_map<size_t, std::pair<size_t, size_t>>*
> in HoeffdingTree, arma::mat in LARS. Some non-numerical types of
> arguments can also emerge in constructors of new machine learning
> algorithms.

Yes, this is a little bit more difficult.  In most of these situations,
a class instance is passed so that the user can specify some of the
numeric parameters of that class.  For instance, the AdaBoost
WeakLearnerType parameter is used to set the parameters of each weak
learner that is built.

So I can see two possibilities although maybe there are more:

 - Use template metaprogramming tricks to, given a type, expand all of
   its constructor arguments into a list of numeric types.  So say we
   had:

     Learner(double a, AuxType b)
     AuxType(double c, double d)

   we would ideally want to extract [double, double, double] as our list
   of types.  I can't quickly think of a strategy for this but it
   *might* be possible...

 - Refactor all classes that take an auxiliary class to instead take a
   template parameter pack to be unpacked into the auxiliary classes'
   constructors.  This will still be a fair amount of metaprogramming
   effort, but I can see a closer route to a solution with this one
   (rough sketch below).
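
To make that second idea concrete, here is a rough sketch of what I
have in mind; Learner and AuxType are just placeholder names, not
existing mlpack classes:

    #include <utility>

    // Placeholder auxiliary class with its own numeric parameters.
    class AuxType
    {
     public:
      AuxType(double c, double d) : c(c), d(d) { }
      double c, d;
    };

    // Placeholder learner: instead of taking an AuxType instance, it
    // forwards any extra constructor arguments to the AuxType it owns.
    class Learner
    {
     public:
      template<typename... AuxArgs>
      Learner(double a, AuxArgs&&... auxArgs) :
          a(a), b(std::forward<AuxArgs>(auxArgs)...) { }

      double a;
      AuxType b;
    };

    int main()
    {
      // A tuner can then work with a flat list of numeric values:
      // [a, c, d] maps directly onto Learner(a, c, d).
      Learner l(0.5, 1.0, 2.0);
      return 0;
    }

With something like this, the grid search optimiser only ever sees
numeric arguments, and the forwarding handles the auxiliary class
construction.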

What do you think?  Do you have any additional ideas?  Note that I have
not spent significant time thinking about or playing with either of
these ideas, so I am not fully sure whether they will work.

> 3. In the case of hyper-parameter tuning, I guess a loss function
> should be a wrapper for a cross-validation class (we want to optimize
> performance on validation sets). But it is not clear what type of
> interface it should provide: DecomposableFunctionType (like for SGD)
> or FunctionType (like for SA or GradientDescent, all prerequisites for
> which can potentially be combined in one class).

I'm not sure I fully follow here; can you clarify?  Do you mean a
wrapper class along the lines of the sketch below?
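
(CVFunction and CVType here are just placeholder names, not existing
mlpack classes; the only fixed part is the simple FunctionType interface
that an optimizer like SA expects, i.e. a
double Evaluate(const arma::mat&) method.)

    #include <mlpack/core.hpp>

    // Placeholder sketch: CVFunction wraps some cross-validation object
    // (CVType, also a placeholder) and exposes the simple FunctionType
    // interface, so that an optimizer can search hyper-parameter space.
    template<typename CVType>
    class CVFunction
    {
     public:
      CVFunction(CVType& cv) : cv(cv) { }

      // Map a point in hyper-parameter space to the negated
      // validation-set performance measured by cross-validation
      // (negated because the optimizers minimize).
      double Evaluate(const arma::mat& parameters)
      {
        // Each element of 'parameters' would be decoded into one
        // constructor argument of the model before cross-validation is
        // run; how that decoding works is exactly the question from
        // point 2 above.
        return -cv.Evaluate(parameters);
      }

     private:
      CVType& cv;
    };

Whether the DecomposableFunctionType interface (as SGD uses) makes sense
instead would depend on whether the cross-validation objective can be
split into separable pieces, so that is worth clarifying too.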

Thanks,

Ryan

-- 
Ryan Curtin    | "This room is green."
ryan at ratml.org |   - Kazan

