[mlpack] NumFOCUS summit report and ideas

Germán Lancioni gmansoft at hotmail.com
Thu Nov 7 13:11:40 EST 2019


Hi folks,

Thanks Ryan for sharing this. I agree with several ideas. A couple of notes:

1) Moving away from IRC may be a good idea - I personally can't use it, let's say because of "1990s environment restrictions".
2) mlpack talks/workshops - this is a very good point. I've been doing some of this and it really pays off. You convert far more users this way than by waiting for people to stumble upon the library. And I truly believe we need more users and more engagement. After all, that's why we build mlpack, right? I would add that we also need talks and workshops that are *not* super technical. For new users (consumers of the library, I mean), metaprogramming is not really attractive. Instead, these users are looking for a simple, useful, easy "Hello World" that delivers instant value. Take a look at scikit-learn, for instance: it's all about examples and ready-to-consume snippets.
3) Website - This is our calling card. That's why I hope the new website we are finishing will give mlpack the standing it deserves. Again, a better website and better documentation bring more users. More users also means more contributions and more growth. That is the ultimate satisfaction. Looking forward to working more on this.

Regards,
German

________________________________
From: mlpack <mlpack-bounces at lists.mlpack.org> on behalf of Ryan Curtin <ryan at ratml.org>
Sent: Wednesday, November 6, 2019 07:44 PM
To: mlpack at lists.mlpack.org <mlpack at lists.mlpack.org>
Subject: [mlpack] NumFOCUS summit report and ideas

Hey everyone,

This past weekend Marcus and I attended the NumFOCUS summit
(https://numfocus.org/summit2019) in New York City, since we are now a
part of NumFOCUS. :)  I introduced mlpack to the rest of the NumFOCUS
community with these slides:

https://www.ratml.org/misc/mlpack-numfocus.pdf

The summit was a great time, and I hope in future years that more of us
will be able to attend.  I would say we are still figuring out the
NumFOCUS thing at the moment.  They are too---NumFOCUS still has some
other legwork to do before everything is set up and mlpack is shown on
their website.

I had quite a lot of discussions over the weekend, and I
thought it would be useful to bring some of the ideas here for further
discussion.  So, without further ado, I've split each thing into its own
section.  If you're interested, please read on, and if you have any
opinions or thoughts, please respond!

-----

  mlpack video meetings

I talked to a lot of people about how they used video chat and meetings.
We've had meetings in the past and I think they've been successful, but
one of the reasons they haven't happened as much is that I've prepared
slides and other materials in the past, so it's high overhead, and I
don't always have the time to do that regularly.

One idea I heard in discussions sounded pretty interesting to me: the
AstroPy community has weekly calls, but they don't have a particular
agenda and it's quite informal.  What I was told is that they all just
jump on a call at a certain time each week, and whoever is there
discusses PRs or directions or other ideas, or, alternately, people just
write code while together in the same video chat room. :)

I think this is an interesting idea, and if others think so too, maybe
we can try it!

-----

  mlpack chat / IRC

Both at the GSoC mentor summit and the NumFOCUS summit, I found that
lots of communities use lots of different tools.  There's a huge
proliferation of different chat services... we are maybe a bit
antiquated with just IRC.  I've thought that for a while, but someone
showed me Matrix (https://matrix.org) and I was quite impressed: Matrix
has support for bridges to all kinds of different services.

So, as an experiment, I've connected Matrix to the mlpack IRC room.  If
you join the Matrix room #mlpack:matrix.org, any messages you send there
will get relayed to IRC.  There is also Gitter and Slack integration,
which I think could be really useful---a lot of people in the past have
asked about other chat services, but we haven't had anything to offer
since in part my aim has been to keep things as low overhead as possible
(and let's face it: with my plaintext emails and command-line mentality,
I like to believe it's still the 90s...).

If you think this is interesting or a good idea, speak up, and maybe we
can make the integrations better and then update the website to point
people towards mlpack's chat services in a nicer way.

-----

  giving talks on mlpack

NumFOCUS runs the PyData conference, and I learned that PyData is not
just about Python libraries.  So, it could be really nice to give talks
at PyData about why C++ machine learning is possible, what things make
it exciting (speed / template metaprogramming are my usual go-tos), and
how it can be done.  If you're interested in giving a talk on mlpack,
NumFOCUS may be able to help, and I can definitely help with
materials---I've prepared a lot over the years.

There are *lots* of opportunities to speak about what we're doing here,
and I personally hadn't realized how many opportunities there are.

-----

  C++ notebooks

I spent some time talking with the CEO of QuantStack
(https://quantstack.net), who showed me a working C++ Jupyter notebook
environment:

https://github.com/QuantStack/xeus-cling

It's built on top of `cling`, an interactive C++ interpreter from the
ROOT project that is based on the clang compiler and can behave as a
REPL.

Maybe people have known about this for a while, but this was the first I
heard of it, and it's very exciting to me: in the past, a clear niche
for mlpack was production/deployment of models, and not so much
interactive experimentation.  But xeus-cling seems like it will make
interactive use possible too.

Personally, I'm going to be playing with xeus-cling and making some
mlpack notebooks for it.  I'll share the efforts as they come along.
QuantStack was really interested in being able to showcase a machine
learning application inside of the xeus-cling notebooks, so there is a
good amount of momentum and collaboration opportunity here.

(Just an aside: the xeus-cling notebooks currently suffer from problems
whenever you try to redefine a variable---which happens a lot and is
really inconvenient.  But the ROOT project released a new version of
cling that has a workaround for this, so the xeus-cling maintainers just
have to repackage that.  It may take a while, but that's the biggest
showstopper that I saw for mlpack C++ notebooks, and if that gets fixed,
I think we can have something really nice for people to interactively
play with and use!)

-----

  logo design and website, etc.

When we came up with ensmallen, we did our best to find a logo, but it's
a little bit generic.  I got connected to some graphic designers who
might be able to help.  I have only very primitive logo design skills
so maybe we can come up with something better here.  I'm not sure how it
will work out, but I figured it was worth reporting. :)

-----

  mlpack/Shogun joint workshop

Each year, NumFOCUS helps Shogun put on a workshop, where their
developers get together in one place.  I think this could be really
great for mlpack, and maybe we could colocate a workshop like this and
find some ways to collaborate between the two projects or something.

I'll probably send out a survey in a while to gauge interest in the
idea, but of course, there would be a lot of planning to do and nothing
is firm yet.  At this point we can consider it just a proposal, but I
will say I think it would be super cool to get our development community
in the same place!  For now we just run into each other occasionally at
conferences and such.

-----

Okay, I know that I've written a lot!  It was a very informative weekend
and I'm quite excited about the things that will come out of it.  If
anything that you read above interests you, please feel free to respond!
Like I said in the talk, mlpack is community-led, so, we all get to
decide as part of a group what it is we think is cool and how we want to
do it. :)

(And, if we disagree on direction, that's perfectly okay!  As part of
our joining process, NumFOCUS asked us to come up with a governance
policy to help guide how we make decisions, and we have a draft:
https://github.com/mlpack/mlpack/pull/2068 )

--
Ryan Curtin    | "You got to stick with your principles."
ryan at ratml.org |   - Harry Waters
_______________________________________________
mlpack mailing list
mlpack at lists.mlpack.org
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack