[mlpack] Question for benchmark of Linear/Logistic Regression

Smith, Dale (Norcross) Dale.Smith at fiserv.com
Thu Jul 17 15:28:02 EDT 2014


Hello,

LARS has advantages, but also a potentially major disadvantage, which is addressed in the discussion following the original LARS paper in the Annals of Statistics. You should probably read up on this before deciding on this approach, particularly with high-dimensional data.

I'm using ridge regression and the AICc to choose the ridge parameter. The LASSO may be more appropriate for your application.
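
To make the idea concrete, here is a minimal Python sketch of picking a ridge parameter by AICc over a grid. It is only an illustration under stated assumptions: the AICc form used is n*log(RSS/n) + 2k + 2k(k+1)/(n-k-1), with k taken to be the effective degrees of freedom of the ridge fit (the trace of the hat matrix). Other AICc variants exist, and the function names (solve, ridge_weights, effective_df, ridge_aicc, choose_lambda) are hypothetical helpers, not part of mlpack.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gram_matrix(X):
    """Return X^T X (d x d) for a list-of-rows matrix X."""
    d = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(d)] for i in range(d)]

def penalized(G, lam):
    """Return G + lam * I."""
    return [[g + (lam if i == j else 0.0) for j, g in enumerate(row)]
            for i, row in enumerate(G)]

def ridge_weights(X, y, lam):
    """Ridge solution of (X^T X + lam I) w = X^T y."""
    d = len(X[0])
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    return solve(penalized(gram_matrix(X), lam), b)

def effective_df(X, lam):
    """Effective degrees of freedom: trace((G + lam I)^-1 G), G = X^T X."""
    G = gram_matrix(X)
    A = penalized(G, lam)
    d = len(G)
    return sum(solve(A, [G[i][j] for i in range(d)])[j] for j in range(d))

def ridge_aicc(X, y, lam):
    """AICc of the ridge fit (assumed form; needs RSS > 0 and n > df + 1)."""
    n = len(y)
    w = ridge_weights(X, y, lam)
    rss = sum((yi - sum(a * b for a, b in zip(r, w))) ** 2 for r, yi in zip(X, y))
    k = effective_df(X, lam)
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def choose_lambda(X, y, grid):
    """Return the grid value with the smallest AICc."""
    return min(grid, key=lambda lam: ridge_aicc(X, y, lam))
```

Calling choose_lambda(X, y, [0.0, 0.1, 1.0, 10.0]) on a list-of-rows design matrix X and a response list y returns the grid value with the lowest AICc. Note that this sketch penalizes every column of X, including any constant column used as an intercept.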

Dale

-----Original Message-----
From: mlpack-bounces at cc.gatech.edu [mailto:mlpack-bounces at cc.gatech.edu] On Behalf Of Ryan Curtin
Sent: Thursday, July 17, 2014 11:45 AM
To: Liu Liu
Cc: mlpack at cc.gatech.edu
Subject: Re: [mlpack] Question for benchmark of Linear/Logistic Regression

On Thu, Jul 17, 2014 at 11:25:04AM -0400, Liu Liu wrote:
> Hi guys,
> 
> I noticed that logistic regression was recently added to mlpack. I was 
> wondering whether you are planning to provide benchmark stats and 
> tutorials for it.
> 
> Regarding the benchmark result for linear regression, I noticed that 
> it fails a lot and runs slower than other packages. Could you please 
> provide some insight on why?

Hello Liu,

The mlpack implementation of linear regression is quite simple and involves inverting a matrix of size (n x n), where n is the number of points in the dataset. Unsurprisingly, this fails for large n, which is also exactly where mlpack's implementation runs more slowly than other packages.

An alternate implementation, such as an iterative approach to solving the system, could provide better results. Honestly, though, in most cases simple linear regression is not the best technique to use, so this method doesn't see much attention.
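
As a rough illustration of what an iterative approach could look like (this is not mlpack code), the Python sketch below runs conjugate gradient on the least-squares normal equations, in the style usually called CGLS. It only uses matrix-vector products with the data matrix and never forms or inverts a large matrix. The helper names (matvec, rmatvec, dot, least_squares_cg) and the default tolerance are assumptions made for this example.

```python
def matvec(X, w):
    """Return X w for a list-of-rows matrix X."""
    return [sum(xij * wj for xij, wj in zip(row, w)) for row in X]

def rmatvec(X, r):
    """Return X^T r without building the transpose."""
    out = [0.0] * len(X[0])
    for row, ri in zip(X, r):
        for j, xij in enumerate(row):
            out[j] += xij * ri
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def least_squares_cg(X, y, max_iter=100, tol=1e-12):
    """Minimize ||X w - y||^2 by conjugate gradient on the normal equations."""
    w = [0.0] * len(X[0])
    r = list(y)            # residual y - X w, with w = 0 initially
    s = rmatvec(X, r)      # X^T r, the negative gradient up to a factor of 2
    p = list(s)            # first search direction
    gamma = dot(s, s)
    for _ in range(max_iter):
        if gamma <= tol:   # squared gradient norm small enough: stop
            break
        q = matvec(X, p)
        alpha = gamma / dot(q, q)
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(X, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return w
```

Each iteration costs two passes over the data, so memory use stays proportional to the size of the dataset itself rather than to n squared.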

You might consider using LARS, which is a superset of linear regression, and will perform standard linear regression when both the l1 and l2 penalty parameters are 0. With a nonzero l2 penalty parameter (and the l1 penalty parameter left at 0), it becomes ridge regression, which is more robust than linear regression.
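
To show the relationship between the two, here is a small Python sketch of the ridge closed form, solving (X^T X + lambda2 * I) w = X^T y. It is not the LARS algorithm and does not use the mlpack API. The function names and the scaling of the penalty are assumptions for illustration, and mlpack's LARS may scale its objective differently. With lambda2 = 0 the system reduces to the ordinary least-squares normal equations.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ridge_fit(X, y, lambda2):
    """Return w solving (X^T X + lambda2 I) w = X^T y.

    lambda2 = 0 gives ordinary least squares (assumes X has full column rank).
    Every column is penalized, including a constant intercept column.
    """
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lambda2 if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    return solve(A, b)

# Small example: points that lie exactly on y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
w_ols = ridge_fit(X, y, 0.0)     # plain linear regression
w_ridge = ridge_fit(X, y, 10.0)  # coefficients shrunk toward zero
```

The system solved here is only d x d, where d is the number of features, so the l2 penalty also keeps the solve well-posed when X^T X is close to singular.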

I hope this is helpful.

Thanks,

Ryan

-- 
Ryan Curtin    | "Lots of respectable people have been hit by
ryan at ratml.org | trains."  - Penny


