[mlpack] GSoC 2014 : Introduction and Interests

Anand Soni anand.92.soni at gmail.com
Tue Mar 4 11:49:32 EST 2014


Hi,

I built mlpack and tried the all-k-nearest-neighbours search on the
iris data set. I am still exploring and analyzing the results. As
mentioned in the project description, we need to implement methods to
compare the accuracy of algorithms. I have a few ideas; I am not sure
yet whether they are useful here, and I am still exploring.

1. Accuracy, precision and recall, n-fold cross-validation (basic stuff).
2. Area under the ROC (Receiver Operating Characteristic) curve: the
probability that a classifier will rank a randomly chosen positive
instance higher than a randomly chosen negative instance.
3. Information-theoretic metrics (still exploring), such as Good's
information reward (for binary classification algorithms).
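
To make items 1 and 2 a bit more concrete, here is a rough Python
sketch. It is not mlpack code, and the function names
(accuracy_precision_recall, auc_by_ranking) are placeholders I made up
for illustration. It assumes binary labels with 1 as the positive
class, and it computes the AUC directly from the ranking definition
above by comparing every positive/negative pair, which is quadratic and
only sensible for small test sets.

```python
def accuracy_precision_recall(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary predictions."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    total = tp + fp + tn + fn
    # Report 0.0 instead of dividing by zero (a choice made for this sketch).
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def auc_by_ranking(y_true, scores, positive=1):
    """Fraction of (positive, negative) pairs in which the positive
    instance gets the higher score; tied scores count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For n-fold cross-validation, the same functions would be applied to the
held-out fold in each round and the results averaged over the folds.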

There are many other possibilities, such as Bayesian models and
statistical confidence intervals, which can be used for such purposes.
I need more clarification on the expectations for this project so
that I can point my research in the right direction before writing the
proposal. I would be glad if someone could help.

Regards.

Anand

On Tue, Mar 4, 2014 at 12:26 AM, Ryan Curtin <gth671b at mail.gatech.edu> wrote:
> On Tue, Mar 04, 2014 at 12:19:30AM +0530, Anand Soni wrote:
>> Ryan,
>>
>> I think that the gatech server is down or not responding. I am not
>> even able to access www.gatech.edu . I will try a bit later and it
>> should work. Thanks a lot, by the way.
>
> Ok; let me know if you have continued issues.  I am able to access it,
> but I'm right here on campus, so there's probably some issue between
> here and where you are.  Hopefully it will be resolved soon...
>
> --
> Ryan Curtin    | "More like a nonja."
> ryan at ratml.org |   - Pops



-- 
Anand Soni | Junior Undergraduate | Department of Computer Science &
Engineering | IIT Bombay | India


