[mlpack] A potential project idea for GSOC 2021

Ryan Curtin ryan at ratml.org
Sun Mar 14 14:49:38 EDT 2021


On Sun, Mar 14, 2021 at 10:19:24PM +0530, RISHABH GARG wrote:
> Hello Marcus and Ryan, I did a bit of research and found a few weaknesses
> in the statsmodels library:
>     1. The algorithms written in it are in-memory algorithms, so it is
> incapable of handling large datasets.
>     2. It does not have very good documentation.
> 
> We can easily beat it in terms of documentation, but I am not sure about
> the external memory algorithms. Also, are the algorithms implemented in
> mlpack in-memory or external-memory?

All mlpack models use Armadillo, which only supports in-memory
computation.  However, the algorithms themselves are implemented in a
generic way, so with a little bit of work and hacking it should be
possible to use external memory for mlpack computations (though I don't
think anybody is really doing this).
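
To make "generic" a bit more concrete, here is a rough sketch of the idea
using only the C++ standard library.  None of this is mlpack or Armadillo
code: `InMemoryMatrix`, `FileBackedMatrix`, `WriteColumnMajor`, and
`MeanPoint` are all hypothetical names made up for illustration.  The one
thing taken from mlpack is the convention that each column is a point.
The algorithm is templated on the matrix type and only asks for `n_rows`,
`n_cols`, and `col(j)`, so it runs unchanged whether the columns come from
RAM or from disk.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical in-memory matrix: column-major, one point per column.
struct InMemoryMatrix {
  std::size_t n_rows;
  std::size_t n_cols;
  std::vector<double> data;

  // Return a copy of column j.
  std::vector<double> col(std::size_t j) const {
    assert(j < n_cols);
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(j * n_rows);
    return std::vector<double>(first,
                               first + static_cast<std::ptrdiff_t>(n_rows));
  }
};

// Hypothetical file-backed matrix: same layout as above, stored as raw
// doubles on disk.  Only one column is held in memory at a time.
struct FileBackedMatrix {
  std::size_t n_rows;
  std::size_t n_cols;
  std::string path;

  // Read column j from the file.
  std::vector<double> col(std::size_t j) const {
    assert(j < n_cols);
    std::ifstream in(path, std::ios::binary);
    in.seekg(static_cast<std::streamoff>(j * n_rows * sizeof(double)));
    std::vector<double> buf(n_rows);
    in.read(reinterpret_cast<char*>(buf.data()),
            static_cast<std::streamsize>(n_rows * sizeof(double)));
    return buf;
  }
};

// Helper: dump column-major data to a file as raw doubles.
void WriteColumnMajor(const std::string& path,
                      const std::vector<double>& data) {
  std::ofstream out(path, std::ios::binary);
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(double)));
}

// Generic algorithm: the mean of all points.  It works with any MatType
// that provides n_rows, n_cols, and col(j).
template <typename MatType>
std::vector<double> MeanPoint(const MatType& m) {
  std::vector<double> mean(m.n_rows, 0.0);
  if (m.n_cols == 0) return mean;  // No points: return zeros.
  for (std::size_t j = 0; j < m.n_cols; ++j) {
    const std::vector<double> c = m.col(j);
    for (std::size_t i = 0; i < m.n_rows; ++i) mean[i] += c[i];
  }
  for (std::size_t i = 0; i < m.n_rows; ++i)
    mean[i] /= static_cast<double>(m.n_cols);
  return mean;
}
```

Calling `MeanPoint` on an `InMemoryMatrix` and on a `FileBackedMatrix`
that holds the same values should give the same result.  A real
external-memory setup would need much more care (caching, block reads,
error handling), which is the "work and hacking" part.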

-- 
Ryan Curtin    | "Hey, tell me the truth... are we still in the
ryan at ratml.org | game?" - The Chinese Waiter

