[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Ryan Curtin ryan at ratml.org
Tue Apr 4 13:28:24 EDT 2017


On Sat, Apr 01, 2017 at 10:55:45AM +0500, Kirill Mishchenko wrote:
> Hi Ryan.
> 
> I’m planning to implement the following functionality as a GSoC project:
> Measurements:
>   - Accuracy
>   - Mean squared error
>   - Precision
>   - Recall
>   - F1
> Validation:
>   - Simple validation (splitting the data once, with the validation set
>     size specified by the user)
>   - K-fold cross-validation
> Hyper-parameter tuning:
>   - Grid-search-based tuning
> 
> Does this seem like a reasonable set of functionality? I have decided to
> include simple validation since it can be more appropriate when we have a
> lot of training data or when training is a time-consuming process.
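
For concreteness, here is a rough sketch of what the proposed k-fold
cross-validation piece might look like.  Every name below (KFoldCV, the
Accuracy metric, the assumed Classify() method and (data, labels, ...)
constructor convention on the model) is a hypothetical illustration of
the design, not existing mlpack code:

#include <mlpack/core.hpp>

// A metric is a class exposing a static Evaluate(model, data, labels).
struct Accuracy
{
  template<typename MLAlgorithm>
  static double Evaluate(MLAlgorithm& model,
                         const arma::mat& data,
                         const arma::Row<size_t>& labels)
  {
    arma::Row<size_t> predictions;
    model.Classify(data, predictions); // Assumes a Classify() method.
    return (double) arma::sum(predictions == labels) / labels.n_elem;
  }
};

// Train on k - 1 folds and evaluate the metric on the held-out fold,
// repeating k times and averaging the results.
template<typename MLAlgorithm, typename Metric>
class KFoldCV
{
 public:
  KFoldCV(const size_t k, const arma::mat& data,
          const arma::Row<size_t>& labels) :
      k(k), data(data), labels(labels) { }

  template<typename... MLAlgorithmArgs>
  double Evaluate(const MLAlgorithmArgs&... args)
  {
    const size_t foldSize = data.n_cols / k;
    double sum = 0.0;
    for (size_t i = 0; i < k; ++i)
    {
      // Columns [begin, end] form the validation fold; the rest trains.
      const size_t begin = i * foldSize;
      const size_t end = (i == k - 1) ? data.n_cols - 1
                                      : begin + foldSize - 1;

      arma::mat trainData = data;
      trainData.shed_cols(begin, end);
      arma::Row<size_t> trainLabels = labels;
      trainLabels.shed_cols(begin, end);

      // Assumes the model trains in its constructor, as many mlpack
      // methods do, with extra hyperparameters forwarded via args.
      MLAlgorithm model(trainData, trainLabels, args...);
      sum += Metric::Evaluate(model, data.cols(begin, end),
                              labels.cols(begin, end));
    }
    return sum / k;
  }

 private:
  const size_t k;
  const arma::mat& data;
  const arma::Row<size_t>& labels;
};

Simple validation would have the same interface with a single
train/validation split in place of the loop, and mean squared error,
precision, recall, and F1 would slot in as alternate Metric classes.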

Hi Kirill,

That sounds good to me.  It would be nice if the grid search optimizer
had an API similar to the existing mlpack optimizers, so that other
optimizers could potentially be used in its place.  (For example, you
could use SGD to find the best values of continuous hyperparameters.
That strategy falls apart a bit when the hyperparameters are not
continuous, so clearly we'll need to extend the OptimizerType paradigm
at least a little.  But we can figure that out when we get there.)
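
To make that idea concrete, here is a hedged sketch of a GridSearch
class exposing the same double Optimize(arma::mat& iterate) signature
that the existing SGD and L-BFGS optimizers use.  The class itself is
hypothetical; only the calling convention mirrors the real optimizer API:

#include <mlpack/core.hpp>
#include <limits>

template<typename FunctionType>
class GridSearch
{
 public:
  // Each column of `grid` is one candidate hyperparameter assignment.
  GridSearch(FunctionType& function, const arma::mat& grid) :
      function(function), grid(grid) { }

  // Evaluate every candidate, store the best in `iterate`, and return
  // its objective value, matching the existing optimizers' convention.
  double Optimize(arma::mat& iterate)
  {
    double bestObjective = std::numeric_limits<double>::max();
    for (size_t i = 0; i < grid.n_cols; ++i)
    {
      const double objective = function.Evaluate(grid.col(i));
      if (objective < bestObjective)
      {
        bestObjective = objective;
        iterate = grid.col(i);
      }
    }
    return bestObjective;
  }

 private:
  FunctionType& function;
  const arma::mat& grid;
};

Here FunctionType would be something like a cross-validation wrapper
whose Evaluate(parameters) trains with the given hyperparameters and
returns a loss to minimize; because the signature matches, an optimizer
like SGD could then be substituted for purely continuous hyperparameters
(given a suitable gradient or approximation thereof).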

Thanks,

Ryan

-- 
Ryan Curtin    | "Do I sound like I'm ordering a pizza?"
ryan at ratml.org |   - John McClane

