[mlpack] GSoC query

Fri Mar 16 10:20:23 EDT 2018

On Fri, Mar 16, 2018 at 12:52:00PM +0530, Manish Kumar wrote:
> Hello,
> I am Manish Kumar (IRC: manish7294). After yesterday's discussion on IRC, I
> went looking for other reliable options related to metric learning, and
> after going through the literature in detail I have picked out some
> alternatives that are comparable to LMNN.
> 
> 1. BoostMetric
> <https://pdfs.semanticscholar.org/65af/3d9b9424cebb1054aac6f71bf2e39a3b1994.pdf>
> (It belongs to the category of supervised learning. It takes LMNN as its
> basis and tries to improve on it by using an alternative exponential loss
> function and a different optimization technique. Overall, it combines the
> characteristics of boosting and metric learning and claims to outperform
> LMNN. See page 7 of the paper for the results.)
> 
> 2. ITML <http://www.cs.utexas.edu/users/pjain/pubs/metriclearning_icml.pdf>
> (This one belongs to the unsupervised category and requires external
> knowledge of similar and dissimilar data points, which act as constraints.
> However, the constraints can be generated from labels, which shifts ITML
> into the supervised category. It has also shown results comparable to
> LMNN.)
> 
> After the discussions, I realized that it would not be a good idea to
> propose something that is not guaranteed to work in the end. So, for the
> time being, I have decided to put LMNN with LRSDP on hold and pick it up
> from the same point in the near future as a worthwhile test project.
> 
> At this point, I may need to redesign my proposal, so I humbly request
> your feedback on my thoughts. If you are in favor, I intend to include the
> implementation of these two state-of-the-art algorithms. I expect them to
> give a solid boost to mlpack's metric learning algorithms.

Hi Manish,

I didn't mean to imply that LMNN might not be useful.  We already have
an SDP solver so it should be possible to at least implement LMNN to use
the existing SDP solver and I think it is expected that that will work
just fine.  So I don't think your proposal needs to be redefined, but I
do think maybe a good "first step" is to get LMNN working with mlpack's
PrimalDualSolver for SDPs (found in src/mlpack/core/optimizers/sdp/).
Then the rest of the project can be making it work with LRSDPs.  This
way, if we do have success with LRSDPs, I think that we might have some
interesting results that could be published at some workshop.  And if
there is no success with LRSDPs, it is not a huge issue since we already
have LMNN working with the regular SDP solver.
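
For concreteness, here is a rough sketch (plain Python, not mlpack code) of
the loss that the LMNN SDP encodes, following the usual formulation: a "pull"
term over pre-chosen target neighbors plus a hinge "push" term over
differently-labeled points, weighted by mu.  The function names, the
`targets` argument, and the default mu = 0.5 are illustrative assumptions and
do not correspond to anything in mlpack.  An SDP solver would search over
positive semidefinite M; this only evaluates the loss for a fixed M.

```python
# Illustrative sketch only: not mlpack code; all names here are hypothetical.

def mahalanobis_sq(M, a, b):
    """Squared distance (a - b)^T M (a - b), with M given as nested lists."""
    d = [ai - bi for ai, bi in zip(a, b)]
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))


def lmnn_objective(M, X, labels, targets, mu=0.5):
    """Evaluate the LMNN loss for a fixed matrix M.

    targets[i] lists the indices of the target neighbors of point i
    (same-class points chosen ahead of time).
    """
    pull = 0.0  # sum of distances to target neighbors
    push = 0.0  # hinge penalty for differently-labeled points inside the margin
    for i, neighbors in enumerate(targets):
        for j in neighbors:
            d_ij = mahalanobis_sq(M, X[i], X[j])
            pull += d_ij
            for l in range(len(X)):
                if labels[l] != labels[i]:
                    d_il = mahalanobis_sq(M, X[i], X[l])
                    # Penalize l unless it is at least one unit farther than j.
                    push += max(0.0, 1.0 + d_ij - d_il)
    return (1.0 - mu) * pull + mu * push


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    X = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]]
    labels = [0, 0, 1]
    targets = [[1], [0], []]
    print(lmnn_objective(identity, X, labels, targets))
```

In the SDP form, the hinge terms become slack variables with linear
constraints on M, and a low-rank approach would optimize a factor R with
M = R R^T instead of M itself.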

Here's another paper discussing acceleration of LMNN, but I think that
focuses on accelerating the nearest neighbor search step, not the actual
solution of the SDP:

https://dl.acm.org/citation.cfm?id=1390302

If you'd rather work on BoostMetric and ITML, please feel free to adjust
your proposal for that, but I definitely don't want you to get the
impression that LMNN+LRSDP is not a good project; I think it is just
fine, assuming that we can at least have an implementation that will
work with a regular SDP solver for the case where we can't get the LRSDP
to converge.

I hope this helps clarify.

Thanks,

Ryan

-- 
Ryan Curtin    | "That rug really tied the room together."
ryan at ratml.org |   - The Dude