[mlpack] Evolution strategies along with policy gradients

Marcus Edel marcus.edel at fu-berlin.de
Wed Mar 7 10:33:26 EST 2018


Hello Chirag,

The Conventional Neural Evolution (CNE) method isn't directly comparable to the Natural
Evolution Strategy, which is much simpler: for example, NES has no mutation or
crossover operation.
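
To make the difference concrete, the core of the NES-style update looks roughly like this (a tiny standalone Armadillo sketch, not mlpack code; the test function and constants are just placeholders):

#include <armadillo>

// Standalone sketch: NES perturbs the current parameters with Gaussian
// noise and steps along the fitness-weighted average of that noise --
// there is no population with mutation/crossover as in CNE.
int main()
{
  const size_t dim = 5, samples = 50, iterations = 300;
  const double sigma = 0.1, stepSize = 0.05;

  arma::vec theta(dim, arma::fill::randu);

  for (size_t i = 0; i < iterations; ++i)
  {
    arma::mat noise(dim, samples, arma::fill::randn);
    arma::vec fitness(samples);

    // Fitness of each perturbed parameter vector (negative sphere function).
    for (size_t k = 0; k < samples; ++k)
    {
      const arma::vec candidate = theta + sigma * noise.col(k);
      fitness(k) = -arma::dot(candidate, candidate);
    }

    // Standardize and take a step along the estimated search gradient.
    fitness = (fitness - arma::mean(fitness)) /
        (arma::stddev(fitness) + 1e-8);
    theta += stepSize / (samples * sigma) * (noise * fitness);
  }

  theta.print("final parameters (should be close to zero):");
}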

Thanks,
Marcus

> On 7. Mar 2018, at 14:08, Chirag Ramdas <chiragramdas at gmail.com> wrote:
> 
> 
> It might make sense to implement the Natural Evolution Strategy as an
> optimizer, see mlpack.org/docs/mlpack-git/doxygen/optimizertutorial.html and
> arxiv.org/abs/1711.06581 for more information. Let me know what you think.
> 
> Makes sense. I have a few questions about this. I was going through the existing optimizers, and found this https://github.com/mlpack/mlpack/tree/master/src/mlpack/core/optimizers/cne
> 
> Have natural evolution strategies already been implemented there, or will I have to implement them separately, referring to this existing implementation?
> 
> Agreed, I really like the idea of combining RL with Neuroevolution; also
> https://github.com/mlpack/mlpack/wiki/Google-Summer-of-Code-Application-Guide
> might be helpful.
> 
> Let me know if I should clarify anything.
> 
> Thanks,
> Marcus
> 
>> On 3. Mar 2018, at 16:31, Chirag Ramdas <chiragramdas at gmail.com> wrote:
>> 
>> Hello Marcus,
>> 
>> Following up on my previous email, where I mentioned finding this idea very interesting
>> https://arxiv.org/abs/1802.04821
>> 
>> So in the past three days, I have been going through OpenAI's blog post on evolution strategies as well as their paper:
>> https://arxiv.org/abs/1703.03864
>> https://blog.openai.com/evolution-strategies/
>> 
>> The blog post is very well written, and brings out the simple yet beautiful way in which evolution strategies work.
>> 
>> As for the paper itself, which combines evolution strategies with policy gradients, I feel it would be a nice addition to the existing mlpack code base.
>> 
>> I could implement a basic evolution strategies module within the src/mlpack/methods/reinforcement_learning module, or as a separate module, and test it on sample functions for a start (reference: https://gist.github.com/karpathy/77fbb6a8dac5395f1b73e7a89300318d).
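>> 
>> Roughly, I imagine the module following the Optimize(function, iterate) pattern described in the optimizer tutorial, something along these lines (the class name and parameters below are only placeholders, and I'd still double-check the exact FunctionType requirements):
>> 
>> #include <armadillo>
>> 
>> // Placeholder sketch only -- not an existing mlpack class.
>> class EvolutionStrategies
>> {
>>  public:
>>   EvolutionStrategies(const size_t populationSize = 50,
>>                       const size_t maxIterations = 1000,
>>                       const double sigma = 0.1,
>>                       const double stepSize = 0.01) :
>>       populationSize(populationSize), maxIterations(maxIterations),
>>       sigma(sigma), stepSize(stepSize) { }
>> 
>>   // The objective only needs an Evaluate() method; it is minimized.
>>   template<typename FunctionType>
>>   double Optimize(FunctionType& function, arma::mat& iterate)
>>   {
>>     for (size_t i = 0; i < maxIterations; ++i)
>>     {
>>       arma::cube noise(iterate.n_rows, iterate.n_cols, populationSize,
>>                        arma::fill::randn);
>>       arma::vec fitness(populationSize);
>> 
>>       // Fitness of each perturbed parameter set (negated objective).
>>       for (size_t k = 0; k < populationSize; ++k)
>>         fitness(k) = -function.Evaluate(iterate + sigma * noise.slice(k));
>> 
>>       // Standardize the fitness values, then step along the
>>       // fitness-weighted average of the perturbations.
>>       fitness = (fitness - arma::mean(fitness)) /
>>           (arma::stddev(fitness) + 1e-8);
>> 
>>       arma::mat step(iterate.n_rows, iterate.n_cols, arma::fill::zeros);
>>       for (size_t k = 0; k < populationSize; ++k)
>>         step += fitness(k) * noise.slice(k);
>> 
>>       iterate += stepSize / (populationSize * sigma) * step;
>>     }
>> 
>>     return function.Evaluate(iterate);
>>   }
>> 
>>  private:
>>   size_t populationSize;
>>   size_t maxIterations;
>>   double sigma;
>>   double stepSize;
>> };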
>> 
>> After that, I could go on and implement the idea suggested in the paper, which combines evolution strategies with a policy gradient technique.
>> 
>> Since the paper suggests that their results are on par with state-of-the-art TRPO/PPO, we could also benchmark the performance of this technique against those methods on a standard MuJoCo environment.
>> 
>> All in all, I feel I can form a proper timeline to fit this into the timeframe of the summer.
>> 
>> Do let me know what you feel about this, and if it appeals to you!
>> 
>> Thanks a lot!
>> 
> 
> 
