[mlpack] Regarding guidance for GSOC '18

Hitesh Bhagchandani hiteshbhagchandani39 at gmail.com
Wed Mar 14 11:39:03 EDT 2018


Hello Marcus,

I have been trying to get up to speed with the papers you mentioned on the
Reinforcement Learning ideas page, and one item listed there is
'gym_tcp_api'. Could you please elaborate on what it really does, and what
'distributed actors/infrastructure' means here? As far as I understand it, it
communicates with OpenAI Gym and runs the algorithm I feed it. Where does
Elixir come in?
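To check my understanding of the agent-environment protocol that gym_tcp_api
would expose over TCP, here is the loop I have in mind, written against a toy
stand-in environment (the class name and dynamics below are my own
placeholders for illustration, not the real gym_tcp_api interface):

```python
import random

class ToyCartPole:
    """Placeholder environment mimicking the Gym reset/step protocol.
    The real gym_tcp_api would serve an actual Gym environment over TCP."""
    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0, 0.0, 0.0, 0.0]  # observation: cart/pole state

    def step(self, action):
        # Trivial dynamics: the episode ends after 10 steps regardless
        # of the action taken; each step yields a reward of 1.
        self.steps += 1
        observation = [0.0] * 4
        reward = 1.0
        done = self.steps >= 10
        return observation, reward, done

env = ToyCartPole()
observation = env.reset()
total_reward = 0.0
done = False
while not done:
    action = random.choice([0, 1])  # the "algorithm I feed" would go here
    observation, reward, done = env.step(action)
    total_reward += reward
print(total_reward)  # 10.0 for this toy environment
```

If that picture is right, the server side would just relay reset/step calls to
the real Gym process, and the agent side could live in any language.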
Regarding 'Double DQN', am I right in assuming this requires knowledge of
deep reinforcement learning? I have been reading the textbook by Sutton
and Barto (and following David Silver's lectures on the same material) to
strengthen my basics. What other sources would you recommend to help me wrap
my head around the three points in the ideas list? Is this
<http://rll.berkeley.edu/deeprlcourse/#lectures> a good starting point?
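For what it's worth, my current understanding of the Double DQN idea is that
the online network selects the next action while the target network evaluates
it, which reduces the overestimation bias of standard DQN. A minimal numerical
sketch of the target computation, with made-up Q-values:

```python
# Double DQN target: the online net picks the action, the target net scores it.
# The Q-values below are made-up numbers purely for illustration.
q_online_next = [1.0, 3.0, 2.0]   # online network Q(s', a) for each action
q_target_next = [0.5, 1.5, 4.0]   # target network Q(s', a) for each action
reward, gamma = 1.0, 0.99

# Standard DQN would use max over the target net: 1.0 + 0.99 * 4.0 = 4.96,
# which can systematically overestimate the true value.
dqn_target = reward + gamma * max(q_target_next)

# Double DQN: argmax over the online net (action 1 here), value from target net.
best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
double_dqn_target = reward + gamma * q_target_next[best_action]

print(best_action)        # 1
print(double_dqn_target)  # 1.0 + 0.99 * 1.5 = 2.485
```

Please correct me if I have the decoupling of selection and evaluation wrong.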
Also, I would like to implement the ideas I have picked up so far using
mlpack, both to understand them better and to use them as an indicator of my
level of comfort with C++ (which might be useful when I submit my proposal).
Where do you suggest I start (which algorithm? what problem?), given that
most of my experience so far has been with Python?
Any guidance would be of immense help.

Thanks,
Hitesh.

On Mon, Mar 12, 2018 at 12:57 AM, Marcus Edel <marcus.edel at fu-berlin.de>
wrote:

> Hello Hitesh,
>
> welcome and thanks for getting in touch.
>
> As far as Reinforcement Learning is concerned, I have played around with
> the
> cart-pole problem and have a working knowledge of q-learning. To be frank,
> I am
> not at all familiar with any of the 3 'recent ideas' given in the above
> link. I
> have tinkered a lot with Neural Networks in Python and built CNN's, RNN's
> and
> Variational Autoencoders using TensorFlow and worked on the standard
> problems of
> text and image classification and sentiment analysis. Also, I have taken a
> look
> at mlpack's existing RL implementations as directed by the ideas page. I
> believe
> that given adequate time and guidance I can pick up the ideas pretty well
> and quickly.
>
>
> A good first step would be to get familiar with the ideas mentioned on the
> project page; you don't have to get every detail, but a general idea would
> help in the proposal preparation step. If you have any questions, please
> don't hesitate
> to ask. Also, note this is a C++ library, so you should be familiar with
> the
> common patterns used over the codebase; see mlpack.org/gsoc.html for more
> information.
>
> Thanks,
> Marcus
>
> On 11. Mar 2018, at 12:11, Hitesh Bhagchandani <
> hiteshbhagchandani39 at gmail.com> wrote:
>
> Hello! I am Hitesh Bhagchandani from BITS, Hyderabad, India, currently
> pursuing my 3rd year in computer science. I am writing to you regarding the
> idea : "Reinforcement Learning" that I came across here
> <https://github.com/mlpack/mlpack/wiki/SummerOfCodeIdeas#reinforcement-learning>
> .
>
> I intend to participate in this year's GSOC and would be very grateful if
> you could help me figure out the next steps I could take in preparing my
> proposal.
>
> As far as Reinforcement Learning is concerned, I have played around with
> the cart-pole problem and have a working knowledge of q-learning. To be
> frank, I am not at all familiar with any of the 3 'recent ideas' given in
> the above link. I have tinkered a lot with Neural Networks in Python and
> built CNN's, RNN's and Variational Autoencoders using TensorFlow and worked
> on the standard problems of text and image classification and sentiment
> analysis. Also, I have taken a look at mlpack's existing
> <https://github.com/mlpack/mlpack/tree/master/src/mlpack/methods/reinforcement_learning> RL
> implementations as directed by the ideas page. I believe that given
> adequate time and guidance I can pick up the ideas pretty well and quickly.
>
> Regards,
> Hitesh Bhagchandani.
>
>
>
