[mlpack] Cross-validation and hyper-parameter tuning infrastructure

Kirill Mishchenko ki.mishchenko at gmail.com
Sat Apr 1 01:55:45 EDT 2017


Hi Ryan.

I’m planning to implement the following functionality as a GSoC project:
Measurements
  - Accuracy
  - Mean squared error
  - Precision
  - Recall
  - F1
Validation
  - Simple validation (splitting data once with validation set size specified by a user)
  - K-fold cross validation
Hyper-parameter tuning
  - Grid search based tuning
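
A rough sketch of how the measurements above could be computed. The function names and signatures here are hypothetical, chosen only for illustration; they are not existing mlpack API, and precision, recall, and F1 are shown for the binary case with an assumed positive class label of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only (not mlpack functions).

// Accuracy: fraction of predictions equal to the true labels.
double Accuracy(const std::vector<int>& predicted,
                const std::vector<int>& actual)
{
  assert(predicted.size() == actual.size() && !actual.empty());
  std::size_t correct = 0;
  for (std::size_t i = 0; i < actual.size(); ++i)
    if (predicted[i] == actual[i])
      ++correct;
  return static_cast<double>(correct) / actual.size();
}

// Mean squared error between predicted and true responses.
double MeanSquaredError(const std::vector<double>& predicted,
                        const std::vector<double>& actual)
{
  assert(predicted.size() == actual.size() && !actual.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < actual.size(); ++i)
  {
    const double diff = predicted[i] - actual[i];
    sum += diff * diff;
  }
  return sum / actual.size();
}

// True positives, false positives, and false negatives for one class.
struct BinaryCounts
{
  std::size_t tp = 0, fp = 0, fn = 0;
};

BinaryCounts Count(const std::vector<int>& predicted,
                   const std::vector<int>& actual,
                   const int positiveClass = 1)
{
  assert(predicted.size() == actual.size());
  BinaryCounts c;
  for (std::size_t i = 0; i < actual.size(); ++i)
  {
    const bool p = (predicted[i] == positiveClass);
    const bool a = (actual[i] == positiveClass);
    if (p && a) ++c.tp;
    else if (p && !a) ++c.fp;
    else if (!p && a) ++c.fn;
  }
  return c;
}

// Precision = tp / (tp + fp); defined as 0 when nothing was predicted positive.
double Precision(const std::vector<int>& predicted,
                 const std::vector<int>& actual)
{
  const BinaryCounts c = Count(predicted, actual);
  return (c.tp + c.fp == 0) ? 0.0 : static_cast<double>(c.tp) / (c.tp + c.fp);
}

// Recall = tp / (tp + fn); defined as 0 when there are no positive points.
double Recall(const std::vector<int>& predicted,
              const std::vector<int>& actual)
{
  const BinaryCounts c = Count(predicted, actual);
  return (c.tp + c.fn == 0) ? 0.0 : static_cast<double>(c.tp) / (c.tp + c.fn);
}

// F1: harmonic mean of precision and recall.
double F1(const std::vector<int>& predicted, const std::vector<int>& actual)
{
  const double p = Precision(predicted, actual);
  const double r = Recall(predicted, actual);
  return (p + r == 0.0) ? 0.0 : 2.0 * p * r / (p + r);
}
```

Returning 0 for the undefined cases (no predicted positives, no actual positives) is just one possible convention; a real implementation might prefer to report an error instead.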

Does this seem like a reasonable set of functionality? I have decided to include simple validation since it can be more appropriate when we have a lot of training data or when training is a time-consuming process.
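
To make the difference between the two validation strategies and the grid search more concrete, here is a minimal sketch. All names (HoldOutValidationRange, KFoldRanges, GridSearch) are hypothetical and not part of mlpack; the sketch assumes the data points are already shuffled, and the grid search is shown for a single numeric hyper-parameter with a score to be minimized (for example, a validation MSE).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Half-open range [begin, end) of data point indices.
using Range = std::pair<std::size_t, std::size_t>;

// Simple validation: the data is split once, and the last
// validationFraction of the points forms the validation set.
Range HoldOutValidationRange(const std::size_t n,
                             const double validationFraction)
{
  assert(validationFraction > 0.0 && validationFraction < 1.0);
  const std::size_t validationCount =
      static_cast<std::size_t>(n * validationFraction);
  return Range(n - validationCount, n);
}

// K-fold cross validation: each of the k ranges is used once as the
// validation set while the remaining points are used for training.
// The first n % k folds receive one extra point.
std::vector<Range> KFoldRanges(const std::size_t n, const std::size_t k)
{
  assert(k >= 2 && k <= n);
  std::vector<Range> folds;
  const std::size_t base = n / k;
  const std::size_t extra = n % k;
  std::size_t begin = 0;
  for (std::size_t i = 0; i < k; ++i)
  {
    const std::size_t size = base + (i < extra ? 1 : 0);
    folds.emplace_back(begin, begin + size);
    begin += size;
  }
  return folds;
}

// Grid search: evaluate the score for every candidate value and
// return the candidate with the lowest score.
double GridSearch(const std::vector<double>& grid,
                  const std::function<double(double)>& score)
{
  assert(!grid.empty());
  double best = grid.front();
  double bestScore = score(best);
  for (std::size_t i = 1; i < grid.size(); ++i)
  {
    const double s = score(grid[i]);
    if (s < bestScore)
    {
      bestScore = s;
      best = grid[i];
    }
  }
  return best;
}
```

With simple validation the model is trained once per candidate value, while with k-fold cross validation it is trained k times per candidate, which is why the former can be preferable when training is expensive.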

Best Regards, 

Kirill Mishchenko

> On 30 Mar 2017, at 00:28, Ryan Curtin <ryan at ratml.org> wrote:
> 
> On Wed, Mar 29, 2017 at 06:15:16PM +0500, Kirill Mishchenko wrote:
>> Thanks for your answer, I was thinking about kind of the same solution.
>> 
>> I have yet another question, an organisational one. There are several
>> phases for evaluation during coding under the GSoC program. Namely,
>> there are three: at the end of June, at the end of July, and at the
>> end of August. My question is: should I plan the coding so that I
>> finish implementing some logical part of the functionality by the
>> start of an evaluation phase?
>> 
>> For example, suppose I plan to implement functionality A1, A2, and B
>> such that I need to spend approximately a week on A1, another week on
>> A2, and 3 weeks on B. Also suppose that A1 and A2 correspond to one
>> logical module, and B can be implemented and tested once at least one
>> of A1 or A2 is implemented. If I decide to implement them in the
>> order A1, B, A2, I will likely finish B approximately by the start of
>> the first evaluation phase. On the other hand, if I decide to
>> implement them in the order A1, A2, B, I will likely be unable to
>> finish B by the start of the first evaluation phase. So, which plan
>> should I prefer in this situation? The more logical one (A1, A2, B)
>> or the evaluation-phase-oriented one (A1, B, A2)? Or does it just
>> depend on my preferences?
> 
> Hi Kirill,
> 
> That's up to you.  From the side of the mentor, at least to me, as long
> as you are making reasonable progress towards your project goals or
> project timeline, it is no problem for the evaluation phase.  Typically
> the "midterm evaluation" doesn't line up with some particular goal, so
> don't feel obligated to shape your proposal specifically around the
> midterm date.
> 
> Let me know if I can clarify anything.
> 
> Thanks,
> 
> Ryan
> 
> -- 
> Ryan Curtin    | "What? Facts?"
> ryan at ratml.org |   - Joe Cairo
