[mlpack] Regarding GSOC Proposal for ANN Algorithms Implemented in mlpack

Thu Mar 12 14:44:10 EDT 2020

Hello Marcus,

Thank you for your encouraging response. I agree that testing, proper documentation, and a tutorial will be very important. Testing all the layers and metrics that I add to mlpack shouldn't be very hard. Hopefully, by then the models repo will have been restructured to support tests, tutorials, and ready-to-deploy models. For object localization, we could implement the following tests:

1. Load weights and run for a few epochs (same as other models).
2. Take random images from the validation dataset and set a minimum IoU (low enough that the test doesn't fail spuriously, yet high enough to show that the model is working).
3. Run the classification accuracy test that I am currently working on for object detection models in the repo as part of the restructuring.
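
As a rough sketch of the check in test 2, the IoU of a predicted box against the ground-truth box could be computed as below. The Box struct and the function names are hypothetical, written only for illustration; they are not existing mlpack code, and the minimum IoU value would still need to be chosen.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical axis-aligned box (not an mlpack type): (x1, y1) is one corner
// and (x2, y2) the opposite corner, with x1 <= x2 and y1 <= y2.
struct Box
{
  double x1, y1, x2, y2;
};

// Intersection over Union of two boxes; returns 0 when the union is empty.
inline double IoU(const Box& a, const Box& b)
{
  assert(a.x1 <= a.x2 && a.y1 <= a.y2);
  assert(b.x1 <= b.x2 && b.y1 <= b.y2);

  // Width and height of the overlap, clamped to 0 for disjoint boxes.
  const double w = std::max(0.0, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
  const double h = std::max(0.0, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
  const double intersection = w * h;

  const double areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const double areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  const double unionArea = areaA + areaB - intersection;

  return unionArea > 0.0 ? intersection / unionArea : 0.0;
}

// Test-style check: the prediction passes when its IoU with the ground truth
// reaches the chosen minimum.
inline bool PassesMinIoU(const Box& predicted, const Box& truth, double minIoU)
{
  return IoU(predicted, truth) >= minIoU;
}
```

A test would then call PassesMinIoU for each sampled validation image and require that it returns true.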

I think the most important parts would be the API (to increase flexibility, especially for a model such as YOLO), proper documentation, and a tutorial. A user should be able to pass in a video or an image and have the frames or video saved directly with bounding boxes drawn. A CLI would be very useful here.
For documentation, I plan to add a README to each folder of the models repo describing usage, parameters, and function calls. The tutorials will probably take the most time, because they need to be simple enough for a user to follow without understanding all of the underlying code.

> The differences between the models are minor so I think supporting other models
> should be straightforward.

And yes, I agree. If we add some more layers and some residual blocks to Darknet 19, we can get Darknet 53, so I can do the same thing that I did with LeNet (v1, v4 and v5). We can have a single Darknet class that returns the layers according to the version, and later add an alias for each version so that we can call DarkNet19 and DarkNet53.
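
A minimal sketch of the versioned class with aliases could look like the following. This is only an illustration and not the models repo API: the member functions are hypothetical, and a real implementation would build mlpack layers instead of returning block counts. The residual block counts for Darknet 53 (1, 2, 8, 8, 4) are taken from the YOLOv3 paper.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: one class template picks the architecture from a
// compile-time version number.
template<std::size_t Version>
class DarkNet
{
  static_assert(Version == 19 || Version == 53,
      "DarkNet: only versions 19 and 53 are sketched here.");

 public:
  // Residual blocks following each of the five downsampling convolutions of
  // Darknet 53. Darknet 19 has no residual blocks, so the list is empty.
  static std::vector<std::size_t> ResidualBlocksPerStage()
  {
    if (Version == 53)
      return { 1, 2, 8, 8, 4 };
    return {};
  }

  // Human-readable name, e.g. "DarkNet53".
  static std::string Name()
  {
    return "DarkNet" + std::to_string(Version);
  }
};

// Aliases so that callers can write DarkNet19 and DarkNet53 directly.
using DarkNet19 = DarkNet<19>;
using DarkNet53 = DarkNet<53>;
```

With this layout, the shared convolutional stem would live in one place and only the version-specific parts would branch on Version.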

> Yeah, time-consuming indeed, but possible, we have a bunch of machines we could
> provide for training models.

That would be really great. I also think loading the Darknet weights into YOLO makes sense, so that at least that portion of the model doesn't have to change much in terms of weights. Overall it would take less time, and since we are already training Darknet, we can easily reuse those weights. Later on, when bindings for other languages are added, I think this will prove to be a very useful model, especially on devices like the Raspberry Pi.
After I implement this, I would run inference on real-time videos on an RPi3, and we could add those results to the README as well. I would love to hear your opinions on this so that I can improve the proposal further. Thanks a lot.

Regards,
Kartik Dutt,
Github id: karikdutt18
