[mlpack] r^2 for linear regression

Ryan Curtin ryan at ratml.org
Thu May 5 17:45:12 EDT 2022


On Thu, May 05, 2022 at 10:26:37AM +0000, Michael Bane wrote:
> Hi all, I've just (happily!) discovered mlpack so please excuse any
> newbie questions. I had a quick look in the docs but couldn't find
> what i am looking for. NB I'm a C programmer trying to get by with C++
> but really starting off with the CLI to see how things work and to do
> initial data exploration, with the C++ coding to follow.

Hey there!  Thanks for getting in touch.  I'm glad you are finding
mlpack useful so far. :)

> Specifically, I am looking at the linear regression and have CLI
> working fine for my input matrix (X,Y), where X has ~20 cols and there
> are ~4000 rows) and can obtain a model and also predict new Y' for new
> X'. This data enables me to use gnuplot to draw a graph such as below.
> However, what I am looking for is an "r squared" value to quantify how
> good the fit is. I have seen there is a member function "computeError"
> but how would I access this from the command line?

From the command line, that support isn't available. Basically, mlpack
has a C++ library component that has detailed and flexible
functionality, and then also a "bindings" component (which includes the
CLI programs you are using) that provides limited, but consistent,
support.

So for computing R^2, you would need to use ComputeError() or perhaps
R2Score from <mlpack/core/cv/metrics/r2_score.hpp>.  It would also be a
reasonable thing to implement support for that in the CLI binding; if
you wanted, you could open an issue for it on GitHub (or a PR if you
want to go ahead and make the changes!).

> NB I'm not sure why I get a warning with this invocation - it seems to
> follow the documentation and it does actually produce
> "newPredicts.csv" with the Y' values:
> mkb at hec003-ssd ~/EEC/mlpack-examples$ mlpack_linear_regression -v --training_file newTrain.csv --test_file newTest.csv  -o newPredicts.csv
> 
> [WARN ] '--output_predictions_file (-o)' ignored because '--test_file (-T)' is specified!

Oh, thanks for pointing this out!  It was a tiny error in the binding.
The warning was incorrect.  A PR with the fix is open now:
https://github.com/mlpack/mlpack/pull/3204

Anyway, hopefully the responses are helpful; let me know if there's
anything I can clarify.

Thanks!

Ryan

-- 
Ryan Curtin    | "And the last thing I would ever do is lie to you."
ryan at ratml.org |   - Marlon

