[mlpack] Regarding GSOC Proposal for ANN Algorithms Implemented in mlpack

Tue Mar 10 14:14:07 EDT 2020

Hello Marcus,
  Thank you for the reply; it encouraged me to come up with a better project proposal. Thanks a lot as well for all the help and the code reviews.

Since the models repo already has object detection models, I think the next step should be object localization. For this I propose YOLOv3 and tinyYOLOv3; a few simple changes should allow conversion between the two. I think this makes sense because they have among the fastest inference times and we can train them using the procedure I describe below. I can break the work down into phases so that it's a bit more coherent.

Phase 1 would be implementing the following:

1. DataLoader for Pascal VOC (I chose this because it has 20 classes and roughly 10k training images, which should make it easier to train a model on than COCO or ImageNet.)
2. Addition of Non-Maximal Suppression to mlpack.
3. Adding a Darknet class (here Darknet-53) to the models repo.
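
To make item 2 concrete, here is a minimal sketch of greedy non-maximal suppression in plain C++. The Box struct, the function names IoU and NonMaxSuppression, and the threshold argument are assumptions made for this illustration; they are not existing mlpack API, and an mlpack version would presumably operate on Armadillo matrices instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical box type for illustration only; not an mlpack type.
struct Box
{
  double x1, y1, x2, y2;  // Corner coordinates, with x1 <= x2 and y1 <= y2.
  double score;           // Detection confidence.
};

// Intersection over union of two axis-aligned boxes.
double IoU(const Box& a, const Box& b)
{
  const double iw = std::max(0.0, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
  const double ih = std::max(0.0, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
  const double inter = iw * ih;
  const double areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const double areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  const double uni = areaA + areaB - inter;
  return uni > 0.0 ? inter / uni : 0.0;
}

// Greedy NMS: visit boxes in order of decreasing score and keep a box only
// if its IoU with every box kept so far is at most iouThreshold.
// Returns the indices of the kept boxes, highest score first.
std::vector<std::size_t> NonMaxSuppression(const std::vector<Box>& boxes,
                                           const double iouThreshold)
{
  std::vector<std::size_t> order(boxes.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
      [&boxes](const std::size_t i, const std::size_t j)
      {
        return boxes[i].score > boxes[j].score;
      });

  std::vector<std::size_t> keep;
  for (const std::size_t i : order)
  {
    bool suppressed = false;
    for (const std::size_t k : keep)
    {
      if (IoU(boxes[i], boxes[k]) > iouThreshold)
      {
        suppressed = true;
        break;
      }
    }
    if (!suppressed)
      keep.push_back(i);
  }
  return keep;
}
```

For a multi-class detector such as YOLOv3, this would typically be run once per class after filtering out low-confidence boxes.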

Adding Darknet is necessary to facilitate training of YOLOv3. It would be a little time-consuming to train tinyYOLOv3 using just NVBLAS, so I think we could take an approach similar to ladder training.
We can first train Darknet and then use those weights in tinyYOLOv3 for the Darknet portion. This way we can break down the training and get better results faster. This is something I did when I had to build and train a RetinaNet and YOLO from scratch for a month-long hackathon.

Phase 2 would be implementing the following:
1. Addition of tinyYOLOv3.
2. During this phase I will be training Darknet as well as incorporating your suggestions.

Phase 3 would have the following:
1. Adding support for YOLOv3.
2. Add support for COCO for future use.
3. Here I will be training tinyYOLOv3.

Phase 4:
1. Some cool visualization tools using matplotlib-cpp to show off our results.
2. Adding inference support for videos.

Another thing that I can do is measure the inference time of tinyYOLOv3 and YOLOv3 on a Raspberry Pi, and we can add the results and a nice video / image to the models repo.
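
As a rough sketch of how such timings could be collected, the helper below averages the wall-clock time of repeated calls using std::chrono. The name AverageMilliseconds and its parameters are hypothetical, and the callable stands in for whatever runs one forward pass of the model; nothing here is mlpack API.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: average wall-clock time of run() in milliseconds.
// A few warm-up calls are made first so that one-off costs (cache misses,
// lazy allocation) do not skew the measurement.
template<typename Callable>
double AverageMilliseconds(Callable&& run,
                           const std::size_t warmup,
                           const std::size_t iterations)
{
  if (iterations == 0)
    return 0.0;

  for (std::size_t i = 0; i < warmup; ++i)
    run();

  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i)
    run();
  const auto stop = std::chrono::steady_clock::now();

  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}
```

It would be called with a lambda that performs a single inference on a fixed input, for example AverageMilliseconds(runOnce, 5, 50), where runOnce is a placeholder name.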

Until then, I will complete all the PRs that I have open and implement an upsampling layer before GSOC.
For FPN, I will add it after GSOC, or hopefully before GSOC ends if I finish the above tasks early.

After this project, the models repo would be able to check one more task off its list. For users who want to run real-time inference on IoT devices that don't necessarily support Python, this will be incredibly useful. I think we can really show off mlpack in a nice object localization video. They would also be able to participate in the COCO challenge.
I know this might not be the best possible representation of the project, so I would love to incorporate any suggestions that you have.

Regards,
Kartik Dutt,
Github id: kartikdutt18