[mlpack] GSoC'22 Project Proposal: Better Stack Layering & Ready-to-Use Model

Shubham Agrawal shubham.agra1206 at gmail.com
Tue Apr 12 23:49:36 EDT 2022


Hello Ryan,

I am thinking of drafting two proposals here so that one can be picked
easily.  If I do the DAG work, then I will implement only the first four
models; if I do models only, I will do all six of them.

And thanks for the suggestion. I will send out a Google Doc containing both
proposals.

Regards
Shubham Agrawal

On Tue, Apr 12, 2022 at 8:36 PM Ryan Curtin <ryan at ratml.org> wrote:

> On Thu, Apr 07, 2022 at 05:44:17PM +0530, Shubham Agrawal wrote:
> > Sorry for the late reply.
> >
> > My thought is to represent the DAG with an adjacency-list approach.
> > Storing pointers to the next and previous layers is required for the
> > forward and backward passes in the layer itself. That's why I am trying
> > to use two utility layers to handle the start and end points. I think I
> > have provided some pseudocode for these passes, but I haven't thought
> > about anything too specific for now. We can also set up a meeting to
> > discuss this.
>
> Do you mean that you plan to modify the Layer class?  That shouldn't be
> necessary.  You should instead just need to hold the adjacency list in
> the class that holds all the layers.  No modification should be needed
> to any of the layers themselves.
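
A minimal sketch of that idea, keeping the adjacency list in the container
and leaving the Layer class untouched.  DAGNetwork, Add(), and Connect() are
placeholder names here, not existing mlpack API:

#include <cstddef>
#include <queue>
#include <vector>

// Placeholder sketch: the topology lives entirely in this container; the
// Layer objects themselves are never modified.
template<typename LayerType>
class DAGNetwork
{
 public:
  // Add a layer and return its node index.
  size_t Add(LayerType* layer)
  {
    layers.push_back(layer);
    children.emplace_back();
    parents.emplace_back();
    return layers.size() - 1;
  }

  // Declare an edge: the output of node `from` feeds the input of node `to`.
  void Connect(const size_t from, const size_t to)
  {
    children[from].push_back(to);
    parents[to].push_back(from);
  }

  // Forward pass in topological order (Kahn's algorithm).  The per-node
  // input/output buffers are elided; that is where the memory handling
  // discussed below comes in.
  void Forward()
  {
    std::vector<size_t> inDegree(layers.size());
    std::queue<size_t> ready;
    for (size_t i = 0; i < layers.size(); ++i)
    {
      inDegree[i] = parents[i].size();
      if (inDegree[i] == 0)
        ready.push(i);
    }

    while (!ready.empty())
    {
      const size_t node = ready.front();
      ready.pop();
      // layers[node]->Forward(input(node), output(node));
      for (const size_t child : children[node])
      {
        if (--inDegree[child] == 0)
          ready.push(child);
      }
    }
  }

 private:
  std::vector<LayerType*> layers;             // the layers, owned elsewhere
  std::vector<std::vector<size_t>> children;  // adjacency list (outgoing)
  std::vector<std::vector<size_t>> parents;   // reverse edges (incoming)
};

The backward pass would walk the same adjacency list in reverse topological
order.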
>
> > About the models list, I have selected some candidate models:
> > 1. AlexNet -
> >
> https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
> > 2. SqueezeNet - https://arxiv.org/pdf/1602.07360.pdf
> > 3. VGG 11, 13, 16, 19 - https://arxiv.org/pdf/1409.1556.pdf
> > 4. Xception - https://arxiv.org/pdf/1610.02357.pdf
> > 5. PolyNet - https://arxiv.org/pdf/1611.05725.pdf
> > 6. NASNet - https://arxiv.org/pdf/1707.07012.pdf
> > About NASNet and PolyNet: they can't be retrained in mlpack for now
> > because of missing GPU support, and they are time-consuming to train
> > even on a GPU.
>
> Sorry that I don't have the context, but if your plan is to implement
> all six of these as well as the DAG network in one project, that's
> great---but do be aware that you may spend more time than you expect
> debugging memory handling of the DAG network implementation.  It's
> important that we avoid data copies, so some amount of time should go
> into that.
>
> You can take a look at, e.g., the implementations of the memory handling
> functions in MultiLayer (in #2777); there is one function to allocate
> memory for each of the forward/backward/gradient passes.  Maybe you have
> already seen that, but in any case, the complexity of that will be a lot
> more for the DAG case. :)
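
To make the no-copy concern concrete, the pattern I had in mind is the
usual Armadillo aliasing one: allocate one contiguous buffer and give each
layer an arma::mat alias (copy_aux_mem = false) over its slice, so the
passes write in place instead of copying.  A toy example (the three sizes
and the layer chain are made up; the real bookkeeping for a DAG will be
more involved):

#include <armadillo>
#include <vector>

int main()
{
  const size_t batchSize = 32;
  // Made-up output dimensionalities of three chained layers.
  const std::vector<size_t> outDims = { 128, 64, 10 };

  size_t total = 0;
  for (const size_t d : outDims)
    total += d * batchSize;

  // One allocation for all intermediate outputs.
  arma::mat buffer(total, 1);

  // Alias matrices over slices of the buffer; copy_aux_mem = false means
  // no copy is made, so each layer's Forward() would write in place.
  std::vector<arma::mat> outputs;
  outputs.reserve(outDims.size());
  size_t offset = 0;
  for (const size_t d : outDims)
  {
    outputs.emplace_back(buffer.memptr() + offset, d, batchSize,
        false /* copy_aux_mem */, true /* strict */);
    offset += d * batchSize;
  }

  // In a real network this would be layers[i]->Forward(..., outputs[i]).
  return 0;
}

The backward and gradient passes would need the same treatment, with their
own buffers, as you describe for MultiLayer above.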
>
> I hope this is helpful!
>
> Thanks,
>
> Ryan
>
> --
> Ryan Curtin    | "Give a man a gun and he thinks he's Superman.
> ryan at ratml.org | Give him two and he thinks he's God."  - Pang
>