[mlpack] GSoC 17: Interested in the Reinforcement learning project

Marcus Edel marcus.edel at fu-berlin.de
Wed Mar 1 12:15:18 EST 2017


Hello Arun,

> First of all, congratulations on being accepted for GSoC 2017.


Thanks and welcome; looking forward to having a lot of fun over the summer.

> I am Arun Reddy, a third-year PhD student in machine learning at Arizona State
> University, USA. My current area of research is transfer learning/domain
> adaptation using deep learning, specifically the problem "Is human expertise
> transferable?".

That sounds really interesting. I guess the "Reinforcement Learning" project
idea goes in a similar direction and would be a good fit.

> I have a good understanding of neural networks and reinforcement learning (RL),
> and would like to apply for the "Reinforcement Learning" project. I have done
> the relevant coursework at my university[1] and took David Silver's course[2]
> as well. During the coursework, I learned how agents interact with the
> environment, and the underlying challenges, through the edX Pac-Man
> projects[3], and I also implemented the famous Atari deep RL paper[4]. I am
> currently working on RL and adaptation, investigating whether it is possible
> to improve the model learned by agents through interaction by scaffolding
> with existing models. Contributing to this project will help me get a
> hands-on, deep understanding of the existing deep RL algorithms. I am looking
> forward to contributing to mlpack, with the aim of getting my hands dirty,
> learning to write efficient and maintainable code from scratch, and being
> part of the open source community.

I didn't know about the Pac-Man projects; the code examples and clear directions
are really nice. Also, since you pointed out some really interesting references:
have you seen "Deep Reinforcement Learning: An Overview" by Yuxi Li? It's a
really comprehensive overview.

> I was able to successfully compile the code and run a few tests. I also got the
> gym_tcp_api working in my local environment. As suggested on the mailing list
> by Marcus, I would like to start off by contributing to a few existing issues
> and then move on to implementing policy gradients to get the hang of mlpack.

Starting with a simple method like stochastic or deterministic policy gradients
is a really good idea. I think temporal difference learning is another approach
that might be manageable and interesting.

Thanks,
Marcus

> On 28 Feb 2017, at 21:22, Arun Reddy <arunreddy.nelakurthi at gmail.com> wrote:
> 
> Hello Devs and fellow GSoC enthusiasts,
> 
> [1] http://rakaposhi.eas.asu.edu/cse571/
> [2] http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html
> [3] http://ai.berkeley.edu/project_overview.html
> [4] https://arxiv.org/abs/1312.5602
> 
> 
> Happy coding,
> Arun
