[mlpack] GSoC 2018 MVU+LRSDP Bugfix

Wed Mar 21 10:41:17 EDT 2018

On Tue, Mar 20, 2018 at 03:22:04AM +0000, Abhijeet Krishnan wrote:
> Hi,
> 
> I am interested in working for mlpack for GSoC 2018, with an interest in
> trying to get the MVU + LRSDP implementation working. I have built mlpack
> and modified it (with help from zoq) to also build the MVU part of the
> code. I am currently attempting to re-write the MVU code to match the
> current APIs, since there were a number of compilation errors which I
> believe were due to the LRSDP APIs being updated without also updating the
> MVU code.
> 
> In terms of debugging the error, I have thought of the following approach -
> 1. Test the MVU code using the primal dual optimizer to verify if the error
> is in MVU or in LRSDP
> 2a. If MVU+Primal Dual converges, the error is most likely in LRSDP
> 2b. If MVU+Primal Dual also does not converge, the error is most likely in
> MVU (might also be in Primal Dual)
> 3. Wherever the error is, my plan is to be clear about how that algorithm
> works (from reading the research papers) and to test the values the
> function returns against some working implementation of it in another
> package
> 4. If the error is in LRSDP, I don't think there is a reference
> implementation. I will need to really dig into the code and algorithm to
> find the error. I guess this is exactly why the issue is marked 10/10 in
> difficulty.
> 
> I would like to know what approaches and efforts have been made previously
> to solve this problem, in particular, if anyone has any intuition as to
> where the problem might lie and what would be a good approach to solve it.
> 
> If anyone has any tips to share on debugging numerical algorithms, that
> would also be welcome.
> 
> I am currently pursuing a PhD in CS from NC State University with plans of
> conducting research in Generative Methods.

Hi Abhijeet,

Thanks for getting in touch.  This is a good start for a plan but I want
to emphasize the difficulty here.  A good knowledge of nonconvex
optimization and semidefinite programs will be involved, and in a good
proposal I think it would be important to include some number of
strategies for debugging optimizers that aren't converging.

There is an additional paper or tech note by Sam Burer (I can't remember
the title) that points out that saddle points do exist in LRSDP, so it
is possible (although I don't think this is the case) that LRSDP is
getting stuck in saddle points and not converging for MVU.

In addition, it's likely to be worthwhile to confirm that the MVU
solution matrix can even be considered to be low-rank---if it can't,
then it's not reasonable to ever expect the algorithm to converge to a
good solution.

I hope these pointers are helpful.

Thanks,

Ryan

-- 
Ryan Curtin    | "You know, I think he's got a point."
ryan at ratml.org |   - The Mayor