[mlpack] Ruby binding for mlpack

Ryan Curtin ryan at ratml.org
Mon Feb 4 20:53:56 EST 2019


On Fri, Jan 18, 2019 at 08:56:28AM +0000, Shekhar Prasad Rajak wrote:
> Hi Ryan,
> Thanks for sharing the links. I tried to understand the codebase of
> mlpack and bindings in my spare time.

Hi Shekhar,

I'm sorry for the slow response on this one.

> I have gone through the documentation and tried to write a Ruby
> binding using the `mkmf` Ruby gem. I am able to use mlpack methods in
> a Ruby C extension. The main points I noticed are:

> 1. What will be the input and output format for the Ruby binding
> (methods should take the same options as the mlpack C++ program)? I
> found that the Python binding uses NumPy and Pandas for data input and
> output, and uses Cython to pass the data into an arma matrix.
> So the Ruby binding could use the daru Ruby gem for importing data and
> for data frames.

It's up to you to propose what we should use.  For any language, though,
we should aim to use the most popular linear algebra library that people
use for data science applications in that language.  It's also important
that we can avoid copying the data matrix when passing the data to
mlpack.  In Python you can see this is done by passing the memory
pointer of the numpy matrix to C++, and then an Armadillo matrix is
wrapped around it.
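
To make the "no copy" idea concrete, here is a minimal sketch using only
the Python standard library.  It is not the mlpack binding code: the
`array` buffer stands in for a numpy matrix and the `ctypes` view stands
in for the Armadillo matrix, both of which are assumptions made purely
for illustration.  The point is only that a second object can be wrapped
around an existing memory address without duplicating the data.

```python
import array
import ctypes

# Stand-in for a data matrix owned by the host language
# (in the real Python bindings this would be a numpy array).
data = array.array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Raw memory address and element count of the existing buffer.
address, n_elem = data.buffer_info()

# Wrap the same memory in a second view without copying it,
# the way a C++ library could wrap a pointer it was handed.
view = (ctypes.c_double * n_elem).from_address(address)

# A write through the view shows up in the original buffer,
# which confirms that both objects share one block of memory.
view[0] = 42.0
shares_memory = ctypes.addressof(view) == address
print(data[0], shares_memory)
```

Whatever Ruby library gets proposed would need some equivalent way to
expose the address and dimensions of its underlying buffer.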

> 2. Loading a data source is fast in mlpack, so it would be better to
> use the data loader from the Ruby code.

I think we can assume that the user already has their data loaded, and
then they can just call mlpack's Ruby bindings after that.

> I was trying to debug the mlpack Python and command-line bindings, but
> I didn't understand how to do it. Let me know if there is a good way
> to run my changes directly from the build directory (or local mlpack
> source) without installing it on the machine (that is, without running
> cmake and make install).

For Python you can do the following, assuming you are in the build
directory:

$ make python
$ export LD_LIBRARY_PATH=lib/
$ export PYTHONPATH=src/mlpack/bindings/python
$ python

then you can do 'import mlpack' and it should work without needing to
install it.  If you compile mlpack with debugging symbols (i.e. cmake
configured with -DDEBUG=ON), then I think you can do 'gdb python' and if
something segfaults or there is some exception, you can get a backtrace.
But honestly in many cases printing information has helped me debug a
lot of the Python problems that I've had during development of the
Python bindings, since running Python with gdb can be very slow.
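
If you would rather script that setup than retype the exports, something
like the sketch below could work.  The helper names `build_tree_env` and
`build_tree_command` are made up for this example and are not part of
mlpack; the two environment variables and paths are the same ones as in
the shell commands above.  The `-X faulthandler` interpreter option is a
standard Python feature that prints the Python traceback on a segfault,
which may be a lighter alternative to gdb in some cases, though I have
not tried it with the bindings.

```python
import os
import sys


def build_tree_env(build_dir, base_env=None):
    """Return an environment that points Python at an uninstalled build.

    Hypothetical helper: it sets the same two variables as the shell
    commands, using paths relative to the given build directory.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LD_LIBRARY_PATH"] = os.path.join(build_dir, "lib")
    env["PYTHONPATH"] = os.path.join(
        build_dir, "src", "mlpack", "bindings", "python")
    return env


def build_tree_command(script=None):
    """Return a command line for a child interpreter with faulthandler on."""
    cmd = [sys.executable, "-X", "faulthandler"]
    if script is not None:
        cmd.append(script)
    return cmd


# Show what would be used for a build directory in the current directory.
env = build_tree_env(os.path.abspath("."))
print(env["LD_LIBRARY_PATH"])
print(env["PYTHONPATH"])
print(build_tree_command("my_test.py"))
```

The two results can then be handed to `subprocess.run(cmd, env=env)`.
The variables are set for a child process because LD_LIBRARY_PATH
generally has to be in place before the interpreter starts.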

There may be better tools out there for this but I am not familiar with
them.  Feel free to look around and if you find something better let me
know. :)

-- 
Ryan Curtin    | "Weeee!"
ryan at ratml.org |   - Bobby

