[mlpack] GSoC-2021

Omar Shrit omar at shrit.me
Mon Mar 29 10:58:52 EDT 2021


Hey Gopi

On 03/29, Gopi Manohar Tatiraju wrote:
> Hey,
> 
> I agree. After going through both candidates a bit, I can see we can
> offload a lot of work by using a well-implemented existing parser.
> I think I should start by comparing both the mentioned libraries to decide
> which one to use. I will use the same benchmark strategy that
> was discussed in the issue. Does that sound good?

Sounds good to me.
 
> Also, I think I can work on replacing Boost Spirit during GSoC, then. This
> will be a start on the data frame idea. If there is time left
> after this, I can start work on the data frame as well. Does that
> seem feasible?

Yes of course.

> Thanks,
> Gopi
>
> 
> On Mon, Mar 29, 2021 at 7:33 PM Omar Shrit <omar at shrit.me> wrote:
> 
> > Hey Gopi,
> >
> > I totally agree with Ryan: using an existing parser will accelerate the
> > project and allow us to move forward with the dataframe class. Also, I
> > do believe that replacing Boost Spirit with an existing parser will take
> > a considerable amount of the summer.
> >
> > Thanks,
> >
> > Omar
> >
> > On 03/29, Ryan Curtin wrote:
> > > On Mon, Mar 29, 2021 at 04:17:35PM +0530, Gopi Manohar Tatiraju wrote:
> > > > Would love to hear your thoughts on whether to go with an already
> > > > implemented parser or build a new one. Also if we are planning to build a
> > > > data frame here then maybe going with an in-house parser would be better
> > > > as we will have the ability to design it in such a way that it can extend
> > > > maximum support to the new data frame which we are planning to build ahead.
> > >
> > > Hey Gopi,
> > >
> > > Honestly I think it's best to use another package.  Not only will this
> > > free up time to actually work on the dataframe class, but also it means
> > > we are not responsible for maintenance of the CSV parser.  There are
> > > lots of little complexities and edge cases in parsing (not to mention
> > > efficiency!) and so we can probably get a lot more bang for our buck
> > > here by using an implementation from someone who has already put in
> > > the time to consider all those details.
> > >
> > > Hope this is helpful. :)
> > >
> > > Thanks,
> > >
> > > Ryan
> > >
> > > --
> > > Ryan Curtin    | "Kill them, Machine... kill them all."
> > > ryan at ratml.org |   - Dino Velvet
> >