[mlpack] Question for benchmark of Linear/Logistic Regression

Ryan Curtin gth671b at mail.gatech.edu
Thu Jul 17 11:45:18 EDT 2014


On Thu, Jul 17, 2014 at 11:25:04AM -0400, Liu Liu wrote:
> Hi guys,
> 
> I noticed that logistic regression is added to MLPACK recently. I was
> wondering whether you are planning to provide benchmark stats for it, and
> also tutorials.
> 
> Regarding the benchmark result for linear regression, I noticed that it
> fails a lot and runs slower than other packages. Could you please provide
> some insight on why?

Hello Liu,

The mlpack implementation of linear regression is quite simple and
involves inverting a matrix of size (n x n), where n is the number of
points in the dataset.  Unsurprisingly, this gets slow as n grows and
fails outright for large n, which matches the slow runs and failures in
the benchmarks.

An alternate implementation, such as an iterative approach to solving
the system, could provide better results.  Honestly, though, in most
cases simple linear regression is not the best technique to use, so
this method doesn't see much attention.
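
To make "an iterative approach" concrete, here is a minimal sketch of
one such method: batch gradient descent on the mean squared error.  It
is not mlpack code; the function name fit_linear_gd, the step size, the
iteration count, and the toy data are all assumptions chosen for
illustration.

```python
# Sketch only: batch gradient descent for least squares, not mlpack's code.

def fit_linear_gd(X, y, step=0.01, iterations=5000):
    """Fit y ~ X * w + b by gradient descent on the mean squared error.

    X is a list of points (each a list of d features), y a list of
    responses.  Only O(d) extra memory is used per iteration.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this point under the current model.
            residual = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += 2.0 * residual * xi[j] / n
            grad_b += 2.0 * residual / n
        # Step against the gradient.
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w, b = fit_linear_gd(X, y, step=0.05, iterations=5000)
```

Each pass touches every point once and never forms a matrix whose size
depends on n, which is the property that would matter for large
datasets.  The step size has to be small enough for the data at hand;
the value above is only known to work for this toy example.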

You might consider using LARS, which generalizes linear regression and
performs standard linear regression when both the l1 and l2 penalty
parameters are 0.  With a nonzero l2 penalty parameter (and an l1
penalty of 0), it becomes ridge regression, which is more robust than
linear regression.
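
As a rough illustration of what the l2 penalty buys, the sketch below
solves the ridge system (X^T X + l2 * I) w = X^T y for two features in
plain Python.  It is neither LARS nor mlpack's implementation, and the
name fit_ridge_2d and the toy data are made up for this example.  With
l2 = 0 it reduces to ordinary least squares via the normal equations.

```python
# Sketch only: ridge regression via the normal equations, two features.
# This is not LARS and not mlpack's implementation.

def fit_ridge_2d(X, y, l2=0.0):
    """Solve (X^T X + l2 * I) w = X^T y for two features, no intercept."""
    a11 = sum(x[0] * x[0] for x in X) + l2
    a12 = sum(x[0] * x[1] for x in X)
    a22 = sum(x[1] * x[1] for x in X) + l2
    b1 = sum(x[0] * yi for x, yi in zip(X, y))
    b2 = sum(x[1] * yi for x, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("singular system; try l2 > 0")
    # Cramer's rule for the 2x2 system.
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Hypothetical toy data: the second feature duplicates the first.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]

try:
    fit_ridge_2d(X, y, l2=0.0)  # plain least squares: X^T X is singular
    ols_failed = False
except ValueError:
    ols_failed = True

w_ridge = fit_ridge_2d(X, y, l2=0.1)  # the l2 penalty makes it solvable
```

With perfectly collinear features the unpenalized system has no unique
solution, while any l2 > 0 makes the matrix invertible and splits the
weight evenly between the two copies.  That is one common sense in
which ridge regression is called more robust; it may not be the only
one intended here.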

I hope this is helpful.

Thanks,

Ryan

-- 
Ryan Curtin    | "Lots of respectable people have been hit by
ryan at ratml.org | trains."  - Penny


