[mlpack] GSoC Idea - New methods (LDA, QDA, and KDA)

Thu Apr 14 00:19:57 EDT 2022

On Wed, Apr 13, 2022 at 02:01:55PM +0530, Suvarsha Chennareddy wrote:
> On Wed, Apr 6, 2022 at 5:31 PM Suvarsha Chennareddy <
> suvarshachennareddy at gmail.com> wrote:
> 
> > Hello everyone,
> > My name is Suvarsha Chennareddy and I’m a fresher (first year) at my
> > university, VIT (Vellore Institute of Technology). I’ve contributed to
> > mlpack a few times over the past few months, and I would like to
> > participate in GSoC. I have an idea, but want approval to begin working on
> > the proposal.
> >
> > I’ve been thinking about adding a new method (or methods) to mlpack as a
> > medium-sized project. More specifically, I was wondering if I could work on
> > the implementation for ‘LDA’ (Linear Discriminant Analysis) and maybe even
> > ‘QDA’ (Quadratic Discriminant Analysis) for my summer project.
> >
> > Please let me know if this isn’t going to work out. If it is indeed ok,
> > should I add both or just ‘LDA’? Once I’m sure that this proposed idea is
> > alright, I’ll start working on a proposal. Thank you for taking the time to
> > read this.
> >
> > Thanks,
> > Suvarsha Chennareddy
> >
> 
> Hello everyone,
> 
> I’ve also decided to implement KDA (along with LDA and QDA). The following
> link will direct you to my proposal:
> https://docs.google.com/document/d/1EieZ5lq6BchFHp2F62lJODvi_GFwvoPtjr30fIJlTWM/edit?usp=sharing
> 
> Any feedback would be greatly appreciated.

Hey Suvarsha,

I took a quick look through.  I think it's totally reasonable to propose
3 methods, especially since they are all related.  Everything you wrote
in your proposal seems to be reasonable to me.  I have just two primary
comments and they are minor:

 * It would be useful to implement a binding for each of these methods,
   so users can use them from other languages.  I don't think you will
   find it difficult to write a binding (if you take a look at the one
   for PCA it is quite simple), but it does take some time to write good
   tests for them.

 * For the tests, you may already be planning this, but you will
   certainly want more than just one test per method. :)  Many tests
   will seem trivial, like basic sanity checks to make sure that it runs
   without crashing on random data, or that it behaves reasonably with
   empty input, or that it returns data of the desired dimensionality.
   That class of test (where you think as deviously as possible about
   how you might break the interface) is different than the algorithmic
   tests you suggested, but it is equally important.  I promise that
   users are always able to be more devious than developers imagine. :)
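
To make the sanity-check idea a bit more concrete, here is a rough sketch
in plain Python.  Everything in it is an assumption for illustration:
`toy_projection` is a hypothetical stand-in (it projects points onto the
difference of two class means, which is not real LDA and not mlpack's
API), and the check names are made up.  The actual tests would be
written against the real classes in mlpack's test suite.

```python
import random


def toy_projection(points, labels):
    """Hypothetical stand-in for a discriminant method (NOT real LDA).

    Projects each point onto the difference between the two class means
    and returns one float per input point.
    """
    if not points:
        raise ValueError("empty input")
    if len(points) != len(labels):
        raise ValueError("points and labels differ in length")
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes")

    dim = len(points[0])
    means = []
    for c in classes:
        members = [p for p, l in zip(points, labels) if l == c]
        means.append([sum(p[j] for p in members) / len(members)
                      for j in range(dim)])

    # Direction pointing from the first class mean to the second.
    direction = [b - a for a, b in zip(means[0], means[1])]
    return [sum(x * w for x, w in zip(p, direction)) for p in points]


def check_runs_on_random_data():
    # Sanity check 1: does not crash on random data.
    rng = random.Random(0)
    points = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
    labels = [i % 2 for i in range(40)]
    return toy_projection(points, labels)


def check_rejects_empty_input():
    # Sanity check 2: empty input gives a clear error, not a crash.
    try:
        toy_projection([], [])
    except ValueError:
        return True
    return False


def check_output_dimensionality():
    # Sanity check 3: one output value per input point.
    out = check_runs_on_random_data()
    return len(out) == 40 and all(isinstance(v, float) for v in out)
```

None of these checks says anything about whether the projection is
correct; they only poke at the interface, which is the point of this
class of test.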

I hope this is helpful!

Thanks,

Ryan

-- 
Ryan Curtin    | "If you understood everything I said, you'd be me."
ryan at ratml.org |   - Miles Davis