[mlpack] GSoC query

Manish Kumar manish887kr at gmail.com
Sat Mar 17 00:52:26 EDT 2018


Hi Ryan,

I went through a number of research papers, and all of them confirm that
LMNN's SDP can be solved with standard SDP solvers, but each raises the
same concern about poor performance on large datasets, which I think is to
be expected given how quickly the number of constraints grows.
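
For reference, and assuming I am remembering Weinberger and Saul's
formulation correctly, the SDP in question is:

\begin{align*}
\min_{M, \xi} \quad & (1 - \mu) \sum_{i,\, j \to i} d_M(x_i, x_j)
    + \mu \sum_{i,\, j \to i,\, l} \xi_{ijl} \\
\text{s.t.} \quad & d_M(x_i, x_l) - d_M(x_i, x_j) \ge 1 - \xi_{ijl}, \quad
    \xi_{ijl} \ge 0, \quad M \succeq 0
\end{align*}

where $d_M(x, y) = (x - y)^T M (x - y)$, $j \to i$ ranges over the target
neighbors of $x_i$, and $l$ ranges over impostors (differently labeled
points). The $M \succeq 0$ constraint is what requires a general-purpose
SDP solver, and the slack constraints are what grow as $O(kn^2)$.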

I think LRSDP will definitely help us gain some scalability, but I can't
say to what extent.
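
To make the idea concrete, here is a minimal C++/Armadillo sketch (the
function names are mine, not mlpack's actual API) of the low-rank trick
behind LRSDP: parameterize M = L L^T and work with L directly, so the
d x d PSD matrix never has to be formed or projected.

    #include <algorithm>
    #include <armadillo>

    // Squared Mahalanobis distance under M = L * L.t().  Since
    // (x - y)' L L' (x - y) = || L' (x - y) ||^2, M is never formed
    // explicitly -- this is what makes the low-rank approach cheaper.
    double SquaredDistance(const arma::mat& L, const arma::vec& x,
                           const arma::vec& y)
    {
      const arma::vec diff = L.t() * (x - y);
      return arma::dot(diff, diff);
    }

    // Hinge-style margin term for one (point, target neighbor, impostor)
    // triple; positive only when the impostor violates the unit margin.
    double TripleLoss(const arma::mat& L, const arma::vec& xi,
                      const arma::vec& xj, const arma::vec& xl)
    {
      return std::max(0.0, 1.0 + SquaredDistance(L, xi, xj)
                               - SquaredDistance(L, xi, xl));
    }

Optimizing over a d x r factor L makes M PSD by construction and leaves
only dr variables when r < d, though the problem is no longer convex in L.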

For now, I will update the proposal with the changes regarding BoostMetric,
describing how we can incorporate it alongside LMNN, and I will start
working on LMNN's LRSDP solver as soon as time permits.
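
To sketch what I have in mind (illustrative names only, not an existing
mlpack API): BoostMetric builds M as a nonnegative combination of trace-one
rank-one matrices, M = sum_j w_j u_j u_j^T, and the exponential vs. logistic
loss choice could be factored out as a template policy in mlpack's usual
style:

    #include <cmath>
    #include <armadillo>

    // Hypothetical loss policies: BoostMetric's original exponential loss
    // and the logistic variant mentioned below, both applied to a margin.
    struct ExponentialLoss
    {
      static double Evaluate(const double margin) { return std::exp(-margin); }
    };

    struct LogisticLoss
    {
      static double Evaluate(const double margin)
      { return std::log(1.0 + std::exp(-margin)); }
    };

    // The learned metric, accumulated as a nonnegative combination of
    // rank-one matrices.  A real implementation would pick each u by the
    // column-generation (leading eigenvector) step; only the accumulation
    // is shown here.
    template<typename LossType>
    class BoostedMetric
    {
     public:
      explicit BoostedMetric(const size_t dimension) :
          m(dimension, dimension, arma::fill::zeros) { }

      // Add one rank-one base learner u with nonnegative weight w.
      void AddRankOne(const arma::vec& u, const double w)
      { m += w * (u * u.t()); }

      double Loss(const double margin) const
      { return LossType::Evaluate(margin); }

      const arma::mat& Metric() const { return m; }

     private:
      arma::mat m;
    };

Then BoostedMetric<ExponentialLoss> and BoostedMetric<LogisticLoss> would
share all the boosting machinery, which also seems like the natural way to
realize the templatization you suggested.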

Just one question: is it necessary to include results for LMNN with an SDP
solver in the proposal? I fear I may not find enough time to produce them
within the application period, or anytime soon, due to a number of
curricular and extracurricular commitments: several competitions, a
hackathon, the upcoming college cultural fest, and above all the semester
examinations. I will try my best to find time among all of this.

Thank you

On 16 March 2018 at 21:58, Manish Kumar <manish887kr at gmail.com> wrote:

> I went through the paper you linked. It mainly describes LMNN's
> gradient-based solver, but I found something there that may be a little
> demotivating for the SDP approach. Please tell me what you think:
>
> "The semidefinite program in the previous section grows in complexity with
> the number of training examples(n), the number of target neighbors (k), and
> the dimensionality of the input space (d).  In particular, the objective
> function is optimized with respect to O(kn2) large margin constraints of
> type(a)and(b), while the Mahalanobis distance metric itself is ad×dmatrix.
> Thus, for even moderately large and/or high dimensional data sets, the
> required optimization(though convex) cannot be solved by standard
> off-the-shelf packages (Borchers, 1999)."
>
> https://dl.acm.org/citation.cfm?id=1390302
> Please see page 3, first paragraph of the Solver section.
>
> Sorry if I am disturbing you a lot.
>
> On 16 March 2018 at 20:19, Ryan Curtin <ryan at ratml.org> wrote:
>
>> On Fri, Mar 16, 2018 at 08:18:10PM +0530, Manish Kumar wrote:
>> > Thanks, Ryan, for helping me out. I got a little anxious on learning
>> > that LMNN+LRSDP may not work, and was worried that my GSoC project
>> > might turn out to be futile.
>> >
>> > I do want to continue with LMNN+LRSDP. I also see a good opportunity
>> > to do something new while working on it; that's why I never planned to
>> > drop it, but was waiting for the right time for a project like this.
>> >
>> > I will stick to the proposal and will try to achieve the results.
>> >
>> > I just want to propose a small modification to the proposal; tell me
>> > if that sounds fair.
>> > I went through the BoostMetric literature and found it to be a
>> > significant improvement over LMNN. So, can we include BoostMetric in
>> > the project in addition to LMNN? I am sure it would be a pretty good
>> > addition; currently, apart from the authors' implementation, there is
>> > no other implementation out there. Moreover, the existing
>> > implementation only considers the exponential loss, and we could add
>> > an implementation based on the logistic loss function, which shows an
>> > improvement over the exponential one. If that's okay, I will make some
>> > small changes to the timeline and the proposal. This way we could have
>> > some pretty awesome metric learning algorithms.
>>
>> That sounds reasonable to me.  I think that BoostMetric is a
>> generalization of LMNN, so maybe through some clever templatization you
>> can support the other components.  I haven't read the paper in detail
>> though so I am not fully sure.
>>
>> --
>> Ryan Curtin    | "Do they hurt?"
>> ryan at ratml.org |   - Jessica 6
>>
>
>

