[mlpack] Collaborative filtering - GSOC project idea

Ryan Curtin ryan at ratml.org
Wed Feb 21 10:44:06 EST 2018


On Wed, Feb 21, 2018 at 07:51:33PM +0530, Haritha Sreedharan Nair wrote:
> Hi Ryan,
> 
> It has been great contributing to mlpack and it is nice to see the amount
> of activity in here after the GSOC announcement :D .

Yeah, I think mlpack is a popular GSoC choice.  It is a lot of emails to
keep up with... :)

> I would like to clarify something. Regarding NCF, was your suggestion to
> try some standard datasets on both NCF(python implementation available) and
> other standard algorithms to compare the stats? Or were you just pointing
> at the need of benchmarking it after implementation? It is something I plan
> to make a part of my proposal, so any suggestion or direction would be
> great.

So, maybe the right answer here is both.  I would say that before we
write NCF for mlpack, we want to know that it can perform well.  You
could show this simply by benchmarking existing implementations and
comparing the results.  If it does perform well (in terms of RMSE or
related metrics on a number of datasets), I'd say that's probably
sufficient for your application (unless you want to do more).  During
the project, were it to be selected, we would need to ensure that the
implementation in mlpack (a) achieves the same performance metrics
(RMSE, etc.) as other implementations and (b) runs faster than they do.
(Otherwise, there is not much reason to add it.)
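
As a rough illustration of that kind of comparison, here is a minimal
sketch that measures RMSE and wall-clock prediction time for one
predictor on held-out ratings.  Everything in it is an assumption made
for illustration: the `benchmark` helper, the `predict(user, item)`
callable, the baseline, and the tiny made-up dataset are hypothetical
and are not part of mlpack or of any NCF implementation.

```python
import math
import time

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))

def benchmark(predict, test_pairs, test_ratings):
    """Return (RMSE, elapsed seconds) for one predictor on held-out ratings.

    `predict` is a hypothetical callable mapping a (user, item) pair to a
    predicted rating; it stands in for whichever implementation is tested.
    """
    start = time.perf_counter()
    predictions = [predict(user, item) for user, item in test_pairs]
    elapsed = time.perf_counter() - start
    return rmse(predictions, test_ratings), elapsed

# Made-up held-out data, for illustration only.
test_pairs = [(0, 1), (0, 2), (1, 1), (2, 0)]
test_ratings = [4.0, 3.0, 5.0, 2.0]

def constant_baseline(user, item):
    # Trivial baseline: always predict the same fixed rating.
    return 3.5

score, seconds = benchmark(constant_baseline, test_pairs, test_ratings)
print(f"baseline RMSE = {score:.4f}, time = {seconds:.6f}s")
```

Running the same `benchmark` call over each candidate implementation and
each dataset gives one (RMSE, time) pair per combination, which is the
table one would want to compare.  Training time would need to be measured
separately; this sketch only times prediction.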

> I will go on and open a PR for having QUIC-SVD and randomized-SVD as MF
> methods in CF and will also work on implementing the few missing concepts
> from the original paper in the coming days. I guess that will give a good
> understanding of the existing model. :)

That sounds good; I will try and review it when I have a chance.

Thanks!

Ryan

-- 
Ryan Curtin    | "Where we're going, we won't need eyes to see."
ryan at ratml.org |   - Dr. Weir

