Dataset and Experimentation Tools - Week 8

Keon Kim, 20 July 2016

This week, I worked on the following:

DatasetMapper & Imputer

1) Optimized Imputer a little bit. The details are discussed in pull request #694.

2) Debugged and polished some comments.

Descriptive Statistics

1) Made statistics.hpp and statistics_impl.hpp, which are basically a more convenient version of the Armadillo statistics functions. They also have more features, such as calculating skewness and kurtosis. They are made to provide convenience, so computational efficiency suffers a little. I made the results match those given by Excel. My commits are in the describe branch.

2) The first version of the statistics class calculated every statistic in its constructor. The benchmark scores are recorded here.

3) Changed iomanip to boost::format for formatting the output.
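
To make the "match Excel" point concrete, here is a minimal sketch of the sample skewness and sample excess kurtosis formulas that Excel's SKEW and KURT functions use. This is not the code from statistics.hpp; the function names (Mean, SampleStdDev, Skewness, Kurtosis) are hypothetical, and it uses only the standard library instead of Armadillo so that it stands alone.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper names; a sketch, not the mlpack implementation.

// Arithmetic mean of the values.
double Mean(const std::vector<double>& x)
{
  double sum = 0.0;
  for (double v : x)
    sum += v;
  return sum / static_cast<double>(x.size());
}

// Sample standard deviation (divides by n - 1).
double SampleStdDev(const std::vector<double>& x)
{
  const double mean = Mean(x);
  double ss = 0.0;
  for (double v : x)
    ss += (v - mean) * (v - mean);
  return std::sqrt(ss / static_cast<double>(x.size() - 1));
}

// Sample skewness: n / ((n-1)(n-2)) * sum(((x_i - mean) / s)^3).
// Returns NaN when there are fewer than 3 values or s is zero.
double Skewness(const std::vector<double>& x)
{
  const double n = static_cast<double>(x.size());
  if (x.size() < 3)
    return std::numeric_limits<double>::quiet_NaN();
  const double mean = Mean(x);
  const double s = SampleStdDev(x);
  if (s == 0.0)
    return std::numeric_limits<double>::quiet_NaN();
  double sum = 0.0;
  for (double v : x)
  {
    const double z = (v - mean) / s;
    sum += z * z * z;
  }
  return n / ((n - 1.0) * (n - 2.0)) * sum;
}

// Sample excess kurtosis:
// n(n+1) / ((n-1)(n-2)(n-3)) * sum(((x_i - mean) / s)^4)
//   - 3(n-1)^2 / ((n-2)(n-3)).
// Returns NaN when there are fewer than 4 values or s is zero.
double Kurtosis(const std::vector<double>& x)
{
  const double n = static_cast<double>(x.size());
  if (x.size() < 4)
    return std::numeric_limits<double>::quiet_NaN();
  const double mean = Mean(x);
  const double s = SampleStdDev(x);
  if (s == 0.0)
    return std::numeric_limits<double>::quiet_NaN();
  double sum = 0.0;
  for (double v : x)
  {
    const double z = (v - mean) / s;
    sum += z * z * z * z;
  }
  return n * (n + 1.0) / ((n - 1.0) * (n - 2.0) * (n - 3.0)) * sum
      - 3.0 * (n - 1.0) * (n - 1.0) / ((n - 2.0) * (n - 3.0));
}
```

Each function makes its own pass over the data and recomputes the mean, which is the same kind of convenience-over-speed trade-off mentioned above. For the values 1, 2, 3, 4, 5 this gives a skewness of 0 and a kurtosis of -1.2.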

I've been studying a little more about how ANNs and RNNs are implemented in mlpack (just personal interest). Deep learning is more fun than I thought; hopefully I can contribute to the neural network parts of mlpack in the future.

Next, I will work a little more on the statistics module, mainly to optimize it further and polish the comments and outputs.

I will also work on the mlpack_preprocess_verify executable, which is just a small extension of the Imputer module. This program does not change or replace any values; it only detects invalid ones.
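
As a rough illustration of the detect-only idea, the sketch below scans a table of values and reports where the invalid entries are without touching the data. It is not the planned mlpack_preprocess_verify code: the function name FindInvalid is hypothetical, and treating "invalid" as "NaN or infinite" is an assumption made only for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (row, column) position of every
// non-finite entry. The input is taken by const reference, so nothing
// is changed or replaced.
std::vector<std::pair<std::size_t, std::size_t>> FindInvalid(
    const std::vector<std::vector<double>>& data)
{
  std::vector<std::pair<std::size_t, std::size_t>> positions;
  for (std::size_t r = 0; r < data.size(); ++r)
  {
    for (std::size_t c = 0; c < data[r].size(); ++c)
    {
      // Assumption: an entry is "invalid" if it is NaN or infinite.
      if (!std::isfinite(data[r][c]))
        positions.emplace_back(r, c);
    }
  }
  return positions;
}
```

A caller could print the returned positions as a report and leave the decision about how to impute or drop those entries to a separate step.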