[mlpack] Reinforcement learning GSOC' 18

ROHAN SAPHAL rohansaphal at gmail.com
Tue Feb 20 13:05:40 EST 2018


Hi,

I am Rohan Saphal, a pre-final-year undergraduate at the Indian Institute
of Technology Madras.

My research interests are in artificial intelligence, specifically deep
reinforcement learning.
I have been working with Prof. Balaraman Ravindran
<https://scholar.google.co.in/citations?user=nGUcGrYAAAAJ&hl=en> on
multi-agent reinforcement learning and will continue with my final-year
thesis project under his guidance.
I am currently a graduate research intern at Intel Labs, working on
reinforcement learning.
Previously, I was a computer vision intern at Caterpillar Inc. As part of
our machine learning course, a competition was organized among the
students, and I secured 1st place in it
<https://www.kaggle.com/c/iitm-cs4011/leaderboard>.
I am familiar with deep learning and have completed the fast.ai MOOC
along with the course offered at our institute.

I have read the papers related to the reinforcement learning algorithms
mentioned on the ideas page. I am interested in working on the
reinforcement learning module.

I have compiled mlpack from source and am looking at the code structure of
the reinforcement learning module. I have not been able to find any open
tickets so far and am hoping that someone could direct me on how to proceed.

I am interested in using reinforcement learning for equity trading, and
recurrent reinforcement learning algorithms in particular have caught my
attention. I believe the stock market is a good environment (a POMDP) in
which to test and evaluate such algorithms, as it is a highly challenging
setting. Many agents are involved in the environment, and I feel that
developing reinforcement learning algorithms that can trade efficiently in
such a setting would be an interesting problem. Deep learning models such
as LSTMs cannot capture the latency involved in the system and hence cannot
make real-time predictions. Reinforcement learning algorithms could, however,
learn how to interact under the latency constraint to make real-time
predictions. Some directions I see for work in this area are to:

   - Implement recent work in multi-agent reinforcement learning
   algorithms.
   - Implement recurrent reinforcement learning algorithm(s) that capture
   the temporal nature of the environment. Modifications can be made to
   existing work.

I would like to hear from the mentors what they think of this idea and
whether it seems like an acceptable project to propose for GSoC.

Thanks for your time.

Hope to hear from you soon. Feel free to ask for any more details about me
or my work.

Regards,

Rohan Saphal