[mlpack] GSoC 2015: Fast k-centers algorithm & implementation

Lukas Zorich lukas.zorich at gmail.com
Mon Feb 23 01:26:01 EST 2015


Hi everyone,

My name is Lukas Zorich, and I am an undergraduate Computer Science student
at the Pontificia Universidad Católica of Chile. I have taken the Discrete
Math and Data Structures and Algorithms courses at my university. My
interest in Machine Learning started with Andrew Ng's online Machine
Learning course on Stanford Online, which I'm taking now. I'm very excited
about learning as much as possible in this field; that's why I'm planning
to take Artificial Intelligence next semester at college, and why I want to
contribute to the mlpack project as a GSoC student.

Reading through the GSoC 2015 ideas, the project that interests me most is
*Fast k-centers algorithm & implementation*. It seems like a very
interesting and fun project to work on, but it also looks difficult and
challenging. I wanted to know whether someone who had not heard of the
Dual-Tree Boruvka MST algorithm until today and has little Machine Learning
experience, but is very motivated and willing to learn everything needed to
complete the project successfully, would be able to handle it. The project
description says that some understanding of geometry and spatial data
structures is necessary: I know about kd-trees, and I have worked with
octrees in a college project before. Don't get me wrong, the fact that
this is a challenging project (for me) motivates me a lot, but I also want
to be realistic.

I have already downloaded mlpack and built it from source. I have also
started to read some papers about dual-tree algorithms, [1] and [2], to
understand how these algorithms work. And, as a previous mailing list post
[3] suggests, I am also going to look at the tree abstractions in mlpack
(src/mlpack/core/tree/) and the dual-tree algorithms mlpack implements, in
particular the DualTreeBoruvka class in src/mlpack/methods/emst/. Are there
any other important resources I should look at? Pointers to other dual-tree
algorithm papers or resources for learning more about these topics would be
great!

I'm really excited to contribute to the community and to learn as much as I
can.

Hoping to hear from you soon,

Lukas.

[1]: Fast Euclidean Minimum Spanning Tree: Algorithm, Analysis, and
Applications - http://www.cc.gatech.edu/~pram/pubs/rp494b-march.pdf
[2]: Tree-Independent Dual-Tree Algorithms -
http://arxiv.org/pdf/1304.4327.pdf
[3]: https://mailman.cc.gatech.edu/pipermail/mlpack/2014-March/000330.html
-- 
Lukas Zorich