[mlpack] Collaborative filtering - GSOC project idea

Haritha Sreedharan Nair haritha1313 at gmail.com
Wed Feb 21 09:21:33 EST 2018


Hi Ryan,

It has been great contributing to mlpack, and it is nice to see how much
activity there has been here since the GSOC announcement :D

I would like to clarify something. Regarding NCF, was your suggestion to
run some standard datasets through both NCF (a Python implementation is
available) and other standard algorithms, and compare the results? Or were
you just pointing out the need to benchmark it after implementation? I plan
to make this part of my proposal, so any suggestions or direction would be
great.

I will go ahead and open a PR adding QUIC-SVD and randomized SVD as MF
methods in CF, and will also work on implementing the few missing concepts
from the original paper in the coming days. I guess that will give me a good
understanding of the existing model. :)

Thank you.

On Tue, Feb 20, 2018 at 6:46 PM, Ryan Curtin <ryan at ratml.org> wrote:

> On Sat, Feb 17, 2018 at 11:48:57AM +0530, Haritha Sreedharan Nair wrote:
> > Hi,
> > I am Haritha Sreedharan Nair (github username- haritha1313).
> > I would like to work on implementing better collaborative filtering
> > models in mlpack as part of GSOC. I had worked on recommendation system
> > based projects earlier and I see a lot of scope in mlpack's CF
> > implementation.
> >
> > As of now I have been through the research paper cited in
> > mlpack/methods/cf and realized that some concepts mentioned in the paper
> > (the ones explaining how to handle biases etc.) haven't been implemented
> > yet. I have also explored a few other research papers and articles
> > including the one mentioned in mlpack's ideas wiki.
> >
> > I found this (https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf)
> > paper interesting - it gives importance to implicit feedback, performs
> > better than existing methods and it is able to generalize the matrix
> > factorization methods we use in mlpack too. Since it is a pretty new
> > research paper I am not able to find any discussions on it and would
> > like to know if the maintainers find it to be worth implementing.
> >
> > I would also like to clarify a doubt. Is there any reason why quic_svd
> > and randomized_svd have not been used for matrix factorization in CF?
>
> Hi Haritha,
>
> Thanks for the nice contributions over the past weeks.  The NCF paper
> has definitely got some amount of attention over the past year and may
> be worth implementing.  But if we do implement it, we should be sure to
> compare it with existing techniques.
>
> You are right that some of the concepts in the papers cited in the CF
> code aren't implemented.  It might be nice to add these, but there is
> not currently an open issue for it.
>
> Lastly, there is not a good reason that QUIC-SVD and randomized SVD
> haven't been used for CF.  If you would like to open a PR please feel
> free. :)
>
> I hope this is helpful; let me know if I can clarify anything.
>
> Thanks!
>
> Ryan
>
> --
> Ryan Curtin    | "How long have you had these things on?"
> ryan at ratml.org | "Sixty-two years."
>