[mlpack] GSoC-2021

Gopi Manohar Tatiraju deathcoderx at gmail.com
Wed Mar 10 01:09:19 EST 2021


Hey Marcus Edel,

Thanks for your feedback.

When we frame trading as an RL problem, on the surface it seems like the
goal of the agent is simply to *maximize net worth*. But there are many
ways to reach this goal, and *different groups of traders work on
different principles*.

Let's compare some:

   - *Day trader:* The goal of any day trader is to maximize profit while
   also minimizing risk (Trading 101: always cap your losses). For this
   use-case, we want to encourage the agent to use a stop-loss, so trades
   placed with a stop-loss should receive more reward than trades placed
   without one. This makes sure our agent learns to cap its losses, which
   is very important in a real-world scenario.
   - *Institutional traders:* These traders consider VWAP (Volume Weighted
   Average Price) the best price at which to acquire a stock, so regardless
   of the current price they always try to buy at VWAP. For cases like
   this, we can penalize the agent for not following VWAP, teaching it
   that VWAP is the target price.


Different reward_schemes will be tailored to different use-cases: depending
on how one wants to trade, one can pick the matching reward scheme.
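
To make this concrete, here is a minimal sketch of how two such
reward_schemes could look in C++ (all class and member names are
hypothetical; nothing here is existing mlpack API):

    // A minimal sketch of pluggable reward_schemes (hypothetical names).
    // Each scheme maps a finished trade to a scalar reward, so the
    // environment can be composed with whichever scheme fits the style.

    #include <cmath>

    // Information about a single executed trade (illustrative fields).
    struct TradeInfo
    {
      double entryPrice;  // Price at which the position was opened.
      double exitPrice;   // Price at which the position was closed.
      double vwap;        // Volume weighted average price at entry.
      bool hadStopLoss;   // Whether the trade had a stop-loss attached.
    };

    // Day-trader style: reward profit, with a bonus when the trade was
    // capped by a stop-loss, so the agent learns to limit its risk.
    class StopLossReward
    {
     public:
      double Reward(const TradeInfo& trade) const
      {
        const double pnl = trade.exitPrice - trade.entryPrice;
        const double bonus = trade.hadStopLoss ? 0.1 * std::abs(pnl) : 0.0;
        return pnl + bonus;
      }
    };

    // Institutional style: penalize the squared distance between the
    // fill price and VWAP, so the agent learns VWAP is the target price.
    class VWAPReward
    {
     public:
      double Reward(const TradeInfo& trade) const
      {
        const double slippage = trade.entryPrice - trade.vwap;
        return -(slippage * slippage);
      }
    };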

Regarding the first idea, I will soon implement a basic structure and open
a PR. I will also send a detailed mail about what I am planning for the
pre-processing tool.

Let me know if you have any more questions regarding reward_schemes or
anything else.

Thanks,
Gopi

On Wed, Mar 10, 2021 at 5:37 AM Marcus Edel <marcus.edel at fu-berlin.de>
wrote:

> Hello Gopi M. Tatiraju,
>
> thanks for reaching out; I like both ideas. I can see the first one
> integrating perfectly into the preprocessing pipeline; that said, it would
> be useful to discuss the project's scope in more detail, specifically what
> functionality you'd like to add: you already implemented some features in
> #2727, so I'm curious to hear what other features you have in mind.
>
> The RL idea sounds interesting as well, and I think it could also fit into
> the RL codebase that is already there. I'm curious: what do you mean by
> "reward schemes"?
>
> Thanks,
> Marcus
>
> On 9. Mar 2021, at 14:55, Gopi Manohar Tatiraju <deathcoderx at gmail.com>
> wrote:
>
> Hello mlpack,
>
> I am Gopi Manohar Tatiraju, currently in my final year of Engineering in
> India.
>
> I've been working on mlpack for quite some time now, trying to contribute
> and learn from the community. I've received ample support from the
> community, which made learning really fun.
>
> Now, as GSoC is back with its 2021 edition, I want to take this
> opportunity to learn from the mentors and contribute to the community.
>
> I am planning to contribute to mlpack under GSoC 2021. Currently, I am
> working on creating a pandas *dataframe-like class* that can be used to
> analyze datasets in a better way.
>
> Having a class like this would help in working with datasets, as ML is not
> only about the model but about the data as well.
>
> I have a pr already open for this:
> https://github.com/mlpack/mlpack/pull/2727
>
> I wanted to know if I can work on this during GSoC. It was not listed on
> the ideas page, but I think it would be the start of something useful and
> big.
>
> If this idea doesn't seem workable right now, I want to implement *RL
> Environments for Trading and some working examples for each env*.
>
>
> Specifically, I am planning to implement the building blocks of any RL
> trading system:
>
>    - *rewards schemes*
>    - *action schemes*
>    - *env*
>
>
> Fin-Tech is a growing field, and Deep Q-Learning has a lot of
> applications there.
>
> I am planning to implement different *strategies* like *Buy-Sell-Hold,
> long only, short only*...
> This will make the examples repo rich in terms of DRL examples...
> We can even build a small *backtesting module* that can be used to run
> backtests on our predictions, as sketched below.
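>
> To give an idea of how small such a module could start, a minimal sketch
> (the function name and signature here are hypothetical):
>
>     #include <armadillo>
>
>     // Tiny backtest sketch: given the per-bar positions an agent
>     // predicted (0 = flat, 1 = long), accumulate the profit of holding
>     // those positions over a historical price series. Illustrative only.
>
>     double Backtest(const arma::vec& prices, const arma::vec& positions)
>     {
>       double netWorth = 0.0;
>       for (arma::uword t = 0; t + 1 < prices.n_elem; ++t)
>         netWorth += positions(t) * (prices(t + 1) - prices(t));
>       return netWorth;
>     }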
>
> There are some Python libraries currently working on such models; we can
> use them as a *reference* going forward.
> *FinRL*: https://github.com/AI4Finance-LLC/FinRL-Library
>
> *Planning to implement:*
>
> Different types of *envs* for different kinds of financial tasks (a rough
> interface sketch follows the list):
>
>    - single stock trading env
>    - multi stock trading env
>    - portfolio selection env
>
> Some example env in python:
> https://github.com/AI4Finance-LLC/FinRL-Library/tree/master/finrl/env
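>
> As a rough sketch (everything below is illustrative, not committed code),
> a single-stock env could follow the shape of the existing mlpack RL
> environments, i.e. nested State and Action types plus Sample(),
> InitialSample(), and IsTerminal():
>
>     #include <armadillo>
>
>     class SingleStockEnv
>     {
>      public:
>       // Observation: the current price and the current position
>       // (0 = flat, 1 = long).
>       class State
>       {
>        public:
>         State() : data(2, arma::fill::zeros) { }
>         arma::colvec& Data() { return data; }
>         const arma::colvec& Data() const { return data; }
>
>        private:
>         arma::colvec data;
>       };
>
>       // Discrete Buy / Hold / Sell actions.
>       class Action
>       {
>        public:
>         enum actions { buy, hold, sell };
>         actions action = hold;
>         static const size_t size = 3;
>       };
>
>       SingleStockEnv(const arma::vec& prices) : prices(prices), step(0) { }
>
>       // One step: apply the action, advance one bar, and return the
>       // change in net worth as the reward (assumes the episode has not
>       // ended yet).
>       double Sample(const State& state, const Action& action,
>                     State& nextState)
>       {
>         double position = state.Data()(1);
>         if (action.action == Action::buy)
>           position = 1.0;
>         else if (action.action == Action::sell)
>           position = 0.0;
>
>         const double reward = position * (prices(step + 1) - prices(step));
>         ++step;
>         nextState.Data() = { prices(step), position };
>         return reward;
>       }
>
>       // Reset to the first bar of the price series.
>       State InitialSample()
>       {
>         step = 0;
>         State state;
>         state.Data() = { prices(0), 0.0 };
>         return state;
>       }
>
>       // The episode ends when the price series is exhausted.
>       bool IsTerminal(const State& /* state */) const
>       {
>         return step + 1 >= prices.n_elem;
>       }
>
>      private:
>       arma::vec prices;
>       size_t step;
>     };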
>
> Different types of *action_schemes*:
>
>
>    - make only long trades
>    - make only short trades
>    - make both long and short
>    - BHS(Buy Hold Sell)
>
> Example action_schemes:
> https://github.com/tensortrade-org/tensortrade/blob/master/tensortrade/env/default/actions.py
>
> We can see classes like BHS, SimpleOrder, etc. there.
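>
> As a rough sketch, an action scheme could simply map the agent's raw
> discrete action to an order, filtering out whatever the scheme forbids
> (hypothetical names, loosely modeled on the tensortrade actions linked
> above):
>
>     #include <cstddef>
>
>     enum class Order { OpenLong, Close, None };
>
>     class LongOnlyScheme
>     {
>      public:
>       // Map a raw discrete action (0 = buy, 1 = hold, 2 = sell) to an
>       // order; this scheme never produces a short order.
>       Order ToOrder(const std::size_t action, const bool hasPosition) const
>       {
>         if (action == 0 && !hasPosition)
>           return Order::OpenLong;
>         if (action == 2 && hasPosition)
>           return Order::Close;
>         return Order::None;
>       }
>     };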
>
> Different types of *reward_schemes* (a sketch of the risk-adjusted case
> follows the list):
>
>
>    - simple reward
>    - risk-adjusted reward
>    - position-based reward
>
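> To illustrate the risk-adjusted case, a Sharpe-like reward over a rolling
> window of recent returns could look like this (an illustrative sketch;
> the window size and the zero-variance guard are arbitrary choices):
>
>     #include <armadillo>
>
>     // Risk-adjusted reward_scheme sketch: mean recent return divided
>     // by its standard deviation, i.e. a rolling Sharpe-like ratio.
>     double RiskAdjustedReward(const arma::vec& recentReturns)
>     {
>       const double mean = arma::mean(recentReturns);
>       const double dev = arma::stddev(recentReturns);
>       // Guard against a flat window where the deviation is zero.
>       return (dev > 0.0) ? mean / dev : 0.0;
>     }
>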
>
> For the past 3 months, I've been working as an ML researcher at a Fin-Tech
> startup, where I have been working on exactly these problems.
>
> I would love to hear your feedback and suggestions.
>
> Regards,
> Gopi M. Tatiraju
>