[mlpack] Interest in contributing to mlpack

Ryan Curtin ryan at ratml.org
Tue Feb 5 20:49:32 EST 2019


On Tue, Feb 05, 2019 at 10:57:35AM -0600, Kungang Zhang wrote:
> Hello, 
> 
> My name is Kungang Zhang, and I am currently a PhD student at
> Northwestern University in Evanston, IL. My research includes
> statistics, machine learning, optimization, and artificial
> intelligence.

Hi Kungang,

Welcome to the community!

> Currently, I am specifically interested in research on
> hyper-parameter tuning and efficient implementations of those
> algorithms. With ever more data and data streams, models are getting
> increasingly complex, and optimizing them for a given set of
> hyper-parameters becomes very costly. Finding the best
> hyper-parameters is critical for good performance. This is an active
> field of research right now, but not many good and efficient
> implementations can be found out there. Cross-validation (or simple
> validation) is usually the go-to method, but it is too costly for
> large-scale models, settings with a limited number of data points,
> and online learning problems. Currently I am interested in
> implementing new algorithms to automate this tuning process, not
> only for categorical hyper-parameters but also for continuous ones.
> From my reading, there are several methods but no definitive answer
> as to which is best, so implementing them in mlpack would help with
> exploring new ideas and new datasets, and would definitely improve
> the diversity of algorithms available for hyper-parameter tuning.

I definitely agree with this.  Kirill's hyperparameter tuner is great
and paves the way for the implementation of lots of interesting
hyperparameter optimizers via ensmallen's categorical functions.
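
For context---and treat this as a sketch from memory rather than the
exact current API---using the existing tuner for a grid search over a
discrete set of lambda values for linear regression looks roughly like
this (the hyperparameter tuning tutorial in the mlpack documentation
is the authoritative version):

  #include <mlpack/core.hpp>
  #include <mlpack/core/cv/metrics/mse.hpp>
  #include <mlpack/core/cv/simple_cv.hpp>
  #include <mlpack/core/hpt/hpt.hpp>
  #include <mlpack/methods/linear_regression/linear_regression.hpp>
  #include <tuple>

  using namespace mlpack::cv;
  using namespace mlpack::hpt;
  using namespace mlpack::regression;

  int main()
  {
    // Toy data: 100 random 5-dimensional points with random responses.
    arma::mat data(5, 100, arma::fill::randu);
    arma::rowvec responses(100, arma::fill::randu);

    // Hold out 20% of the data for validation.
    const double validationSize = 0.2;
    HyperParameterTuner<LinearRegression, MSE, SimpleCV>
        hpt(validationSize, data, responses);

    // Grid search over a discrete set of candidate lambdas; the tuner
    // returns the value that minimizes MSE on the validation set.
    arma::vec lambdas{ 0.0, 0.001, 0.01, 0.1, 1.0 };
    double bestLambda;
    std::tie(bestLambda) = hpt.Optimize(lambdas);
  }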

Once upon a time I was trying to implement SMAC and ROAR and some other
algorithms, but I ran out of time before they worked, and my research
ended up going in directions other than hyperparameter tuning.

In any case, though, I think it would be really great if someone were
interested in implementing these things.  It would be a nice GSoC
project to implement them and also to document them well enough that
new users can come along and effectively use the hyperparameter
optimizer with ensmallen optimizers for their machine learning tasks.
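
To make the continuous-hyperparameter case concrete: if I remember the
design correctly, the tuner takes the optimizer as a template
parameter (grid search by default), so a gradient-based ensmallen
optimizer can be swapped in to search over a continuous lambda.
Continuing the sketch above---again, double-check the names against
the docs:

  // Use gradient descent instead of the default grid search, so that
  // lambda is treated as a continuous hyperparameter.
  HyperParameterTuner<LinearRegression, MSE, SimpleCV,
      ens::GradientDescent> hpt2(validationSize, data, responses);

  // The underlying ensmallen optimizer can be configured directly.
  hpt2.Optimizer().StepSize() = 0.001;
  hpt2.Optimizer().MaxIterations() = 100000;

  // Passing a plain double gives the initial value for the continuous
  // search; the tuner returns the lambda it converged to.
  double bestLambda;
  std::tie(bestLambda) = hpt2.Optimize(0.25);

New optimizers like SMAC would then slot into this same interface.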

> This idea is related to my interest in reinforcement learning,
> because I got it from my interest in the multi-armed bandit problem
> (a simple version of RL) and from my last internship. This kind of
> approach has been shown to work in real applications, but of course
> efficient implementations and new ideas are worth more effort. I
> have been reading the mlpack mailing list for a while and think I
> can learn from and contribute to this community by doing this
> project (besides day-to-day interaction). I am considering applying
> for GSoC 2019, even though there is no detailed project about
> hyper-parameter tuning on the ideas list yet. Any advice on how to
> prepare ideas and proposals for this is very welcome.

There's no need for anything to be in the Ideas List for you to propose
it---that is just a starting point. :)  Proposals don't have to be
submitted for a while yet, so there is still plenty of time.
Mostly what I look for in a proposal is a clear description of the idea
to be implemented, and a clear timeline that is reasonable.  There's
some more information on this wiki page:

https://github.com/mlpack/mlpack/wiki/Google-Summer-of-Code-Application-Guide

> Also, I am currently interested in reinforcement learning. I would
> like to implement efficient algorithms for the RL package and maybe
> try some new ideas. Thank you very much!

Sounds good---I am not much of an expert in mlpack's RL code, so I
can't comment too much, but it is definitely an interesting and
exciting area of activity.

Thanks!

Ryan

-- 
Ryan Curtin    | "What? Facts?"
ryan at ratml.org |   - Joe Cairo

