[mlpack] Potential Proposal for GSoC 2021

Germán Lancioni gmansoft at hotmail.com
Mon Mar 15 13:44:45 EDT 2021


Hi Anush,

This is a great area to work on. As Omar mentioned, a well-defined scope focuses your GSoC effort and makes the most of it. If you find that the available GSoC time is not enough, I would recommend implementing just one of the algorithms, e.g. XGBoost, so you can concentrate on making it complete instead of stretching your time across three.

Looking forward to your proposal, very exciting!

Regards,
German

________________________________
From: mlpack <mlpack-bounces at lists.mlpack.org> on behalf of Anush Kini <anushkini at gmail.com>
Sent: Monday, March 15, 2021 09:14 AM
To: Omar Shrit <omar at shrit.me>
Cc: mlpack at lists.mlpack.org <mlpack at lists.mlpack.org>
Subject: Re: [mlpack] Potential Proposal for GSoC 2021

Hi Omar,

Thank you for the inputs.
What you said makes complete sense to me.

I will prioritise algorithm correctness, detailed documentation, and tutorials over implementing multiple features.
Additionally, I will include a proof of concept in my proposal, with sample code and metrics.

Thanks & Regards,
Anush Kini

On Mon, Mar 15, 2021 at 3:43 PM Omar Shrit <omar at shrit.me> wrote:
Hello Anush,

The XGBoost, LightGBM and CatBoost algorithms would be a great addition to
mlpack this year. Since GSoC is shorter, I would concentrate on these
algorithms, with the relevant tests and examples.

You need to demonstrate in your proposal that you have a good knowledge
of decision tree algorithms. As always, a good starting point is a proof
of concept with relevant benchmarks.

These are my suggestions; I hope you find them helpful.

Thanks,

Omar

On 03/14, Anush Kini wrote:
> Hi mlpack team,
>
> I am Anush Kini. My GitHub handle is Abilityguy
> <https://github.com/Abilityguy>.
>
> I have been getting familiar with the code base for the last couple of
> months.
> I am planning to apply for GSoC 2021 and wanted some feedback on my project
> proposal for the same.
>
> I am building on the 'Improve mlpack's tree ensemble support' idea from the
> wiki.
> I would like to implement XGBoost and LightGBM algorithms. If the schedule
> permits, I will look towards implementing CatBoost too.
>
> Additionally, I would like to work on bringing some additional features to
> the ensemble suite:
> 1. I would like to work on #2619
> <https://github.com/mlpack/mlpack/issues/2619>, which aims to add
> regression support to Random Forests.
> 2. Implementing methods to get impurity-based feature importance,
> similar to feature_importances_ in scikit-learn
> <https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.feature_importances_>.
>
> Finally, I plan to supplement any new features implemented with tutorials
> in mlpack/examples <https://github.com/mlpack/examples>.
> Looking forward to hearing your opinions and suggestions.
>
> Thanks & Regards,
> Anush Kini

