[mlpack] mlpack's 2021 GSoC projects

Ryan Curtin ryan at ratml.org
Fri May 28 16:16:12 EDT 2021


Hello everyone!

In just about two weeks, this summer's coding for Google Summer of Code
will start.  This year we have 8 projects, and I'm excited about each of
them.  I wanted to provide some details about each of these projects, to
give an idea of what will be happening this summer. :)

------

"Revamp mlpack bindings", by Nippun Sharma
 mentored by Ryan Curtin, James Balamuta, and Yashwant Singh

Now that mlpack provides not only a command-line interface but also
interfaces in Python, Julia, Go, and R, it has become necessary to
replace the single-function interface that mlpack's bindings currently
provide with a more modern interface that users will find familiar.
For Python, this means that each of mlpack's algorithms will be wrapped
in a class that provides a scikit-learn-like interface.  Nippun is from
Delhi, is a big fan of music (he plays the drums), and likes driving.
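
To make the difference between the two binding styles concrete, here is
a rough Python sketch.  It is not mlpack code: the function
`mean_model`, the class `MeanModel`, and all parameter names are
hypothetical, and the "model" just predicts the mean of the training
responses.  It only shows the shape of a single-function interface next
to a class with separate fit() and predict() methods.

```python
# Hypothetical sketch: these names are illustrative, not mlpack's API.

def mean_model(training_responses=None, input_model=None, test=None):
    """Single-function style: one call may train, predict, or do both."""
    output = {}
    model = input_model
    if training_responses is not None:
        model = {"mean": sum(training_responses) / len(training_responses)}
        output["output_model"] = model
    if test is not None:
        if model is None:
            raise ValueError("need training_responses or input_model")
        output["predictions"] = [model["mean"] for _ in test]
    return output


class MeanModel:
    """Class style: training and prediction are separate methods."""

    def fit(self, X, y):
        # X is ignored by this toy model; it is kept for the familiar shape.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


# Both styles give the same predictions for this toy model.
result = mean_model(training_responses=[1.0, 3.0], test=[[5.0], [6.0]])
model = MeanModel().fit([[0.0], [1.0]], [1.0, 3.0])
assert result["predictions"] == model.predict([[5.0], [6.0]])
```

With the class style, the trained state lives on the object, so the
user does not have to pass a model back in through a keyword argument.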

------

"Improve tree ensemble support", by Rishabh Garg
 mentored by Ryan Curtin and German Lancioni

Tree ensembles are arguably the best class of machine learning
algorithms out there.  They regularly win data science competitions.
This project aims to implement the XGBoost algorithm to
improve mlpack's tree ensemble support.  Rishabh attends IIT Mandi, in
India, and is very interested in both AI and cryptography.  He also
plays piano.

------

"Ready to Use Models in mlpack", by Aakash Kaushik
 mentored by Kartik Dutt and Marcus Edel

Aakash will implement MobileNetV1, which also requires implementing
depthwise separable convolutions, and a ResNet model builder that can
be used to create ResNet18, ResNet34, ResNet50, ResNet101, and
ResNet152.  The ResNet builder and MobileNetV1 will go into the models
repository, and depthwise separable convolutions will go into the
mlpack repository as a layer inside the neural network codebase.
Aakash is a student at the SRM Institute of Science and Technology in
Chennai, India, and loves random conversations about all manner of
things.
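
For anyone unfamiliar with depthwise separable convolutions, here is a
small pure-Python sketch of the idea in one dimension.  It is only an
illustration under simplifying assumptions (stride 1, no padding, no
kernel flip, no bias) and is not the planned mlpack layer; the function
name and argument layout are made up for this example.  Each input
channel is first filtered with its own kernel (the depthwise step), and
then a 1x1 convolution mixes the channels (the pointwise step).

```python
def depthwise_separable_conv1d(x, depthwise_kernels, pointwise_weights):
    """Toy 1-D depthwise separable convolution (hypothetical helper).

    x:                 C_in channels, each a list of L values.
    depthwise_kernels: C_in kernels, one per input channel, each length K.
    pointwise_weights: C_out rows of C_in weights (a 1x1 convolution).
    """
    # Depthwise step: every channel is filtered by its own kernel only.
    depthwise = []
    for channel, kernel in zip(x, depthwise_kernels):
        k = len(kernel)
        depthwise.append([
            sum(channel[i + j] * kernel[j] for j in range(k))
            for i in range(len(channel) - k + 1)
        ])

    # Pointwise step: each output channel is a weighted sum of the
    # depthwise outputs at the same position.
    out_len = len(depthwise[0])
    return [
        [sum(w * depthwise[c][i] for c, w in enumerate(weights))
         for i in range(out_len)]
        for weights in pointwise_weights
    ]


x = [[1, 2, 3], [4, 5, 6]]                 # 2 channels, length 3
kernels = [[1, 1], [1, -1]]                # one length-2 kernel per channel
pointwise = [[1, 1], [2, 0]]               # 2 output channels
print(depthwise_separable_conv1d(x, kernels, pointwise))  # [[2, 4], [6, 10]]
```

The appeal is the parameter count: a standard convolution with kernel
length K needs C_out * C_in * K weights, while this factorization needs
only C_in * K + C_out * C_in.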

------

"Replacing boost::spirit", by Gopi Manohar Tatiraju
 mentored by Omar Shrit

This project revolves around reducing the binary footprint of mlpack by
replacing the functionality that boost::spirit currently provides,
while handling a more diverse range of data in mlpack.  This project is
part of the bigger goal of removing Boost dependencies.  Gopi will
reimplement mlpack's custom CSV parser, which is currently used to
handle non-numeric data, by adapting Armadillo's internal CSV parser to
handle non-numeric data.
Gopi studies at the Mukesh Patel School of Technology Management and
Engineering in India, and has spent some time recently picking up
finance and trading, including cryptocurrencies and NFTs.
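
As a rough picture of what "handling non-numeric data" in a CSV file
can mean, here is a short Python sketch that replaces each distinct
non-numeric token in a column with an integer code.  This is not how
mlpack or Armadillo implement it; the function `load_mixed_csv` is
hypothetical, and the per-token mapping is a simplifying assumption
(a real loader may, for example, treat a whole column as categorical).

```python
import csv
import io


def load_mixed_csv(text):
    """Parse CSV text, mapping non-numeric tokens to per-column codes.

    Returns (data, mappings), where mappings[col] maps each string seen
    in that column to the numeric code used for it in data.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    mappings = [dict() for _ in range(len(rows[0]))]
    data = []
    for row in rows:
        parsed = []
        for col, token in enumerate(row):
            token = token.strip()
            try:
                parsed.append(float(token))
            except ValueError:
                # New strings get the next unused code for this column.
                codes = mappings[col]
                parsed.append(float(codes.setdefault(token, len(codes))))
        data.append(parsed)
    return data, mappings


data, mappings = load_mixed_csv("1.5,red\n2.0,blue\n3.5,red\n")
print(data)         # [[1.5, 0.0], [2.0, 1.0], [3.5, 0.0]]
print(mappings[1])  # {'red': 0, 'blue': 1}
```

Keeping the mapping alongside the numeric matrix lets the original
strings be recovered later.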

------

"Example Zoo", by Roshan Swain
 mentored by Kartik Dutt and Marcus Edel

Example Zoo is a set of examples built with mlpack that showcase its
potential use on real-world problems.  It will give new users a better
starting point by letting them learn from ready-to-run code.  It will
showcase the usage of the API and how it can be integrated with
different visualization libraries for cool graphs and plots.  Roshan is
majoring in electrical
engineering at the National Institute of Technology, Agartala, in India.
He's planning to learn the violin when he's able.

------

"Example Zoo", by David Port Louis
 mentored by Kartik Dutt and Marcus Edel

Example Zoo gives beginners a glimpse of most, if not all, of the
features provided by mlpack, such that users can run the code for
themselves to see it in action, maybe change things, break it, and
figure out how to fix it.  This should let a beginner become familiar
with the library faster than reading documentation and starting from
scratch.  David is in Puducherry, India, and likes gardening and
cycling.

------

"Improvisation and Implementation of ANN Modules", by Abhinav Anand
 mentored by Marcus Edel

Abhinav will be adding new layers (Upsample, Group Normalization, and
ChannelShuffle) to the ANN module of mlpack.  He will also improve the
speed of the pooling operations of the max, mean, and LP pooling
layers, as well as the un-pooling operation of mean pooling layers.
Abhinav just graduated last month, and will soon start working.
He is a chess expert and plays for at least half an hour every day.
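
Of the new layers, ChannelShuffle is the easiest to show in a few
lines.  The sketch below follows the usual description of the operation
(split the channels into groups, then interleave the groups) and works
on a plain list of channels; it is an illustration only, and the helper
name and signature are made up rather than taken from mlpack.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (hypothetical helper).

    Equivalent to viewing the channel list as a (groups, per_group)
    grid, transposing it, and flattening it again.
    """
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
print(channel_shuffle([0, 1, 2, 3, 4, 5], 2))  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive block of channels contains members
of every original group, so a following grouped convolution sees
information from all groups.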

------

"A Framework for Multiobjective Optimizers", by Nanubala Gnana Sai
 mentored by James Balamuta, Sayan Goswami, and Marcus Edel

The ensmallen library boasts an extensive set of optimizers, almost all
of which focus on single-objective problems.  Previous work by Sayan
Goswami paved the way for multiobjective optimizers.  This was further
complemented by the addition of the Schaffer-N1 and Fonseca-Fleming
test suites.  This project will add optimizers, extend the test
framework, and make ensmallen more accepting of multi-objective problems
in general.  Sai has been contributing to C++ machine learning projects
for quite a while now, and previously was a part of the Shogun ML
effort.  He also likes to play piano, and is a fan of history.
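
For readers new to multiobjective optimization, here is a small Python
sketch using the Schaffer N.1 test problem mentioned above, which
minimizes f1(x) = x^2 and f2(x) = (x - 2)^2 at the same time.  The
helper names are made up for this example and have nothing to do with
ensmallen's actual interface; the brute-force Pareto filter is just the
simplest way to show the concept, not an optimizer.

```python
def schaffer_n1(x):
    """Schaffer N.1: two objectives to minimize simultaneously."""
    return (x ** 2, (x - 2) ** 2)


def dominates(a, b):
    """True if a is no worse than b everywhere and better somewhere."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (O(n^2) filter)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


candidates = [-1, 0, 1, 2, 3]
objectives = [schaffer_n1(x) for x in candidates]
print(pareto_front(objectives))  # [(0, 4), (1, 1), (4, 0)]
```

No single x minimizes both objectives, so the answer is a set of
trade-offs; for this problem the surviving candidates are exactly those
with x between 0 and 2.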

------

If you want more information about each project, the Summer of Code
website has more:

https://summerofcode.withgoogle.com/projects/?sp-search=mlpack

Anyway, these 8 great projects will get started soon!  (Actually, some
have already started. :))  All the students are available in the chat
channels, so if you want to get to know them or have any questions about
the projects, feel free to ask!

Thanks, and have a great weekend!

Ryan

-- 
Ryan Curtin    | "Chappie is in the car!"
ryan at ratml.org |  - Chappie

