[mlpack] Ideas on AdaBoost developments

Mj Liu noodleslik at gmail.com
Wed Mar 5 20:07:15 EST 2014


Hi all,
      I would like to thank 闫林 (godspeed1989 at gmail.com) and Ryan Curtin
(gth671b at mail.gatech.edu) for their suggestions and for looking after my
question about applying for GSoC. I have checked the GitHub repository and
learned a lot; thanks.
    Recently, I went through the mlpack tutorials, compiled the package, and
looked at several of its methods. I have been thinking about the philosophy
behind mlpack's design: why we develop mlpack and how we should present it to
users.
     With that in mind, I have several suggestions on the design of the
AdaBoost part:

   - Like any other mlpack method, the AdaBoost part should provide:
      - 1) a simple command-line executable, with the weak learner
      selectable via parameters;
      - 2) a simple C++ interface;
      - 3) a generic, extensible, and powerful C++ class (AdaBoost) for
      complex usage. (A rough sketch of such a class follows after this
      list.)
   - The weak learners should be developed separately from the AdaBoost
   part; that is, each weak learner should provide self-standing
   functionality, and AdaBoost is just another method that improves the
   results of those weak learners. (In the sketch below, the weak learner is
   a template parameter of the AdaBoost class.)
   - AdaBoost should provide a multi-threaded version as well as the general
   single-threaded version. Multi-core machines are widely used in industry
   and research centers, and AdaBoost runs the same procedure many times, so
   multi-threading is reasonable; a single-threaded version is still needed
   for small problems and small data sets. (See the OpenMP sketch after this
   list. I would also like to thank xxx for providing the AdaBoost
   repository on GitHub, thanks.)
   - As I checked gmm, knn, and the other methods, I was wondering whether
   it is possible to build a uniform interface for all of the methods, e.g.
   an "Algorithms.hpp" interface. It would define a uniform way for users to
   call any method, such as run(), load(), and save(); with a uniform
   interface, all of the methods would be easier to learn. (A rough sketch
   follows after this list.)
   - I think mlpack should be independent of "arma"; I suggest porting some
   basic routines from "arma" into mlpack itself. That would make mlpack
   much easier to install and save a lot of time preparing the build.
   - I think the work of developing AdaBoost should be split into several
   periods:
      - 1) implement one or two weak learners;
      - 2) implement a single-threaded AdaBoost;
      - 3) implement a multi-threaded version of AdaBoost;
      - 4) add more weak learners to mlpack.
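
To make the first two points above more concrete, here is a minimal sketch of
what the generic C++ class could look like, with the weak learner as a
template parameter. All names here are hypothetical and are not existing
mlpack API; the weak learner is only assumed to be trainable with per-point
weights and able to classify points.

// A minimal sketch of the generic AdaBoost class, templatized on a
// self-standing weak learner as suggested above. Hypothetical names, not
// existing mlpack API. The weak learner is only assumed to provide:
//   void Train(const arma::mat& data, const arma::Row<size_t>& labels,
//              const arma::rowvec& weights);
//   void Classify(const arma::mat& data,
//                 arma::Row<size_t>& predictions) const;
#include <mlpack/core.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

template<typename WeakLearnerType>
class AdaBoost
{
 public:
  AdaBoost(const size_t iterations = 100) : iterations(iterations) { }

  // Binary AdaBoost training; labels are 0 or 1, one point per column.
  void Train(const arma::mat& data, const arma::Row<size_t>& labels)
  {
    arma::rowvec weights(data.n_cols);
    weights.fill(1.0 / data.n_cols);

    for (size_t t = 0; t < iterations; ++t)
    {
      // Train this round's weak hypothesis on the current weights.
      WeakLearnerType h;
      h.Train(data, labels, weights);

      arma::Row<size_t> predictions;
      h.Classify(data, predictions);

      // Weighted training error of this hypothesis.
      double error = 0.0;
      for (size_t i = 0; i < data.n_cols; ++i)
        if (predictions[i] != labels[i])
          error += weights[i];

      if (error >= 0.5)
        break; // Weak learner is no better than chance; stop boosting.
      error = std::max(error, 1e-10); // Guard against division by zero.

      const double alpha = 0.5 * std::log((1.0 - error) / error);
      hypotheses.push_back(h);
      alphas.push_back(alpha);

      // Re-weight: emphasize misclassified points, de-emphasize correct ones.
      for (size_t i = 0; i < data.n_cols; ++i)
        weights[i] *= std::exp(predictions[i] == labels[i] ? -alpha : alpha);
      weights /= arma::accu(weights);
    }
  }

  // Final prediction is a weighted vote of all trained hypotheses.
  void Classify(const arma::mat& data, arma::Row<size_t>& predictions) const
  {
    arma::rowvec score(data.n_cols, arma::fill::zeros);
    for (size_t t = 0; t < hypotheses.size(); ++t)
    {
      arma::Row<size_t> p;
      hypotheses[t].Classify(data, p);
      for (size_t i = 0; i < data.n_cols; ++i)
        score[i] += alphas[t] * (p[i] == 1 ? 1.0 : -1.0);
    }

    predictions.set_size(data.n_cols);
    for (size_t i = 0; i < data.n_cols; ++i)
      predictions[i] = (score[i] >= 0.0) ? 1 : 0;
  }

 private:
  size_t iterations;
  std::vector<WeakLearnerType> hypotheses; // One trained learner per round.
  std::vector<double> alphas;              // Vote weight of each learner.
};

The command-line executable from point 1 could then simply instantiate this
class with whichever weak learner the user selects via a parameter.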
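
On the multi-threading point: the boosting rounds themselves are sequential,
since each round's weights depend on the previous round's result, so the
parallelism would typically sit inside a round, for example in training the
weak learner or in the per-point weight update. A rough OpenMP sketch of the
latter (again hypothetical, not part of mlpack):

// A rough OpenMP sketch of a parallel per-point weight update inside one
// boosting round. Compile with -fopenmp; without OpenMP the pragma is
// ignored and the loop simply runs single-threaded.
#include <mlpack/core.hpp>
#include <cmath>

void UpdateWeights(arma::rowvec& weights,
                   const arma::Row<size_t>& labels,
                   const arma::Row<size_t>& predictions,
                   const double alpha)
{
  double sum = 0.0;

  // Each point's new weight is independent of the others, so the loop
  // parallelizes cleanly; the normalization constant is a reduction.
  #pragma omp parallel for reduction(+:sum)
  for (long i = 0; i < (long) weights.n_elem; ++i)
  {
    weights[i] *= std::exp(predictions[i] == labels[i] ? -alpha : alpha);
    sum += weights[i];
  }

  weights /= sum; // Normalize so the weights form a distribution again.
}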
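
For the uniform interface idea, a rough sketch of what an "Algorithms.hpp"
could declare (only my suggestion; the names Run(), Load(), and Save() are
placeholders):

// Algorithms.hpp (hypothetical): one calling convention that every method
// (gmm, knn, AdaBoost, ...) could implement, so users learn a single
// interface for all of them.
#include <mlpack/core.hpp>
#include <string>

class Algorithm
{
 public:
  virtual ~Algorithm() { }

  // Run the method on the given data set (one point per column).
  virtual void Run(const arma::mat& data) = 0;

  // Load a previously trained model from a file.
  virtual bool Load(const std::string& filename) = 0;

  // Save the trained model to a file.
  virtual bool Save(const std::string& filename) const = 0;
};

As far as I understand, mlpack usually prefers compile-time templates over
virtual base classes for speed, so the same run()/load()/save() naming could
also be kept as a documented convention (or a template policy) rather than an
abstract class.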

      I would like to thank 闫林 (godspeed1989 at gmail.com) and Ryan Curtin
(gth671b at mail.gatech.edu) again for their comments last time.
     Thanks for reading and for your time. Any comments are welcome.

    Best Regards!