[mlpack] GSoC 2016: Dataset and experimentation tools

Ryan Curtin ryan at ratml.org
Mon Mar 21 11:09:32 EDT 2016


On Fri, Mar 18, 2016 at 09:05:51PM +0300, Даниил Константинович Корбут wrote:
> Hello!
> 
> I would like to make an application that works from the terminal, as in
> the example, and also a version with a graphical user interface. I think
> there should be several modes for checking a CSV table, for example:
> 
>    - interactive mode, where users can correct errors as they notice
>      them;
>
>    - info mode, which reports the problem spots in the CSV table;
>
>    - a mode in which the user states up front what missing fields should
>      be replaced with, for example NULL -> 18 in column Age, or
>      NaN -> "No name" in column Name. This idea should preferably be
>      developed in the GUI version of the application;
>
>    - I think we might suggest that users make backup copies of a CSV
>      table before changing it, and create a folder with the older version
>      of the table in a specified directory;
>
>    - of course, there must be documentation for all options, which can be
>      obtained by printing -help on the terminal, as well as video manuals
>      on using and installing the application.
> 
> 
> I think it is a great idea to build such a useful tool. I personally use
> CSV tables in my machine learning course, and it would be quite useful for
> me and for anyone who deals with machine learning and data analysis.
> 
> If the project is still open, I would like to contribute and discuss it. :)

Hi there,

All of the projects are still open---the application process is not over
yet. :)

One issue is that we should support more than just CSV tables.  People
may be using other formats like TSV, ARFF, packed binary, .txt, and so
forth, and mlpack supports those too, so whatever programs we end up
with for this project should ideally support all of those formats.
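
As a rough illustration of what format-independent handling could look
like, here is a minimal Python sketch that picks a field delimiter from
the file extension and applies the per-column replacement of missing
values suggested above.  It does not use mlpack's own loading code; the
names read_table, DELIMITERS, and MISSING are hypothetical, only
delimited text formats (CSV and TSV) are covered, and the set of tokens
treated as missing is an assumption.

```python
import csv
import io
import os

# Hypothetical mapping from file extension to field delimiter. Only
# delimited text formats are covered in this sketch (no ARFF or binary).
DELIMITERS = {".csv": ",", ".tsv": "\t"}

# Tokens treated as missing values (an assumption for this sketch).
MISSING = {"", "NULL", "NaN", "nan"}


def read_table(stream, filename, defaults):
    """Read delimited rows from `stream`.

    The delimiter is chosen from the extension of `filename`. A field
    whose text is in MISSING is replaced by defaults[column_index] when
    such an entry exists; otherwise it is left unchanged.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext not in DELIMITERS:
        raise ValueError("unsupported format: " + ext)
    reader = csv.reader(stream, delimiter=DELIMITERS[ext])
    rows = []
    for row in reader:
        rows.append([
            defaults.get(i, field) if field.strip() in MISSING else field
            for i, field in enumerate(row)
        ])
    return rows


# Example: fill a missing Age with "18" and a missing Name with "No name".
data = io.StringIO("Name,Age\nAnn,NULL\nNaN,31\n")
rows = read_table(data, "people.csv", {0: "No name", 1: "18"})
```

Keeping the format-specific part down to a small lookup table means the
checking and replacement logic is written once; supporting another
format would then mostly be a matter of adding a reader for it.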

Thanks,

Ryan

-- 
Ryan Curtin    | "I am."
ryan at ratml.org |   - Joe


