mlpack blog
mlpack Documentation

Table of Contents

This page contains a list of all recent posts.

Automatically-Generated Go Bindings - Week 09

This week I finished passing matrices back and forth, which means I have working handwritten bindings for the PCA method. I have also dealt with some smaller issues: how to pass default values, how to return multiple values, and how to print out documentation. For the first issue, since Go doesn't have method overloading or default values in function prototypes, I made a struct holding the optional parameters, which are set to their default values when the struct is initialized. For the second, Go has built-in support for multiple return values, so we can simply return multiple values in the function prototype and won't need a dictionary or other data structure to hold the output variables. For the last issue, we will use godoc, a tool that prints out the comment above a function prototype. This week, I will focus on finishing passing a model pointer to Go, after which I will start generating the C code!
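Since the generated Go API did not exist yet at the time of writing, the names below (PcaOptionalParam, InitializePca, and the placeholder return values) are purely illustrative; a minimal sketch of the two ideas (optional parameters in a defaults-filled struct, and multiple return values) might look like this:

```go
package main

import "fmt"

// PcaOptionalParam and InitializePca are hypothetical names used only to
// illustrate the pattern: optional parameters live in a struct, and the
// initializer fills in their default values.
type PcaOptionalParam struct {
	Scale             bool
	NewDimensionality int
}

func InitializePca() *PcaOptionalParam {
	return &PcaOptionalParam{
		Scale:             false,
		NewDimensionality: 0,
	}
}

// Pca illustrates returning multiple values directly from the function
// prototype instead of packing them into a dictionary. The [][]float64
// matrices and the second return value are placeholders, not the real
// binding's types.
func Pca(input [][]float64, param *PcaOptionalParam) (output [][]float64, extra float64) {
	// ... the generated binding would call into mlpack through C here ...
	return input, 0.0
}

func main() {
	param := InitializePca()
	param.NewDimensionality = 2
	out, extra := Pca([][]float64{{1, 2, 3}}, param)
	fmt.Println(out, extra)
}
```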

Variational Autoencoders - Week 9

I was finally able to get some good results with the network. I used the MeanSquaredError for these results. Training with ReconstructionLoss generates barely recognizable digits. Sumedh and I were thinking that this might be due to some fundamental faults in the way we are modelling the distribution for the output.

I realized that I was using a ReLU activation after the decoder while normalizing the data to (-1, 1); since ReLU cannot produce negative outputs, targets in (-1, 0) were unreachable, which is why the reconstruction loss wasn't decreasing. After removing the activation, the model trained well, and here are some results.

Passing a random Gaussian sample to the decoder:

Varying a latent variable of a Gaussian sample continuously:

To work with the decoder separately from the network, I first thought about serializing the parameters in the Sequential layer, but later realized that it only acts as a container and its parameters member is empty. After discussing it with Marcus, I have decided to solve this by overloading the Forward() function of the FFN class. The new overload will take additional arguments: the starting and ending indices of the layers to forward-pass through.

I am currently debugging a VAE model using convolutional layers. Once done, I think it will give better results.

Neural Collaborative Filtering - Week 9

Another week is over, and I think we are a few steps closer to having Neural Collaborative Filtering implemented in mlpack. This week I modified the Train() method and added Gradient() and Evaluate(), based on suggestions from Marcus, so that we can deal with training instances within epochs. I also modified the current NCF class to accept an AlgorithmType and create a network according to the user-specified algorithm, and completed ncf_main so that the NCF algorithms can be accessed from the command-line interface. I have also added neural matrix factorization to the existing NCF algorithms.

So now that the basic structure is ready, I am currently spending time debugging the code. I had initially focused on making the model work on implicit feedback, but now I am making slight modifications to generalize it to both the implicit and explicit feedback cases. This week I intend to work through all the errors and small changes required so that by the end of the week we have a working NCF class, trainable with multiple optimizers on both implicit and explicit data.

LMNN (via LRSDP) & BoostMetric Implementation - Week 9

Last week, bounds for LMNN were the main focus. We started off by debugging the bounds in the slack-term PR, and eventually found plenty of problems to deal with and quite a number of questions to answer. After working through all of the issues in an orderly fashion, we were able to develop much more efficient and scalable bounds for the slack term of the optimization problem. As a result, we now have a much higher pruning rate for inactive constraints. We have seen that the bounds are highly dataset dependent, so pruning can range from as low as zero prunes to nearly all of the inactive constraints being pruned. Generally the number increases over the iterations, eventually settling to a more or less constant value.
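For context, this is the standard LMNN objective (Weinberger & Saul); its second, hinge-loss term is the slack term referred to above, and a triple (i, j, l) is an inactive constraint when its hinge evaluates to zero, which is what the new bounds let us detect and prune cheaply:

\varepsilon(L) = (1 - \mu) \sum_{j \rightsquigarrow i} \| L(x_i - x_j) \|^2 + \mu \sum_{j \rightsquigarrow i} \sum_{l} (1 - y_{il}) \left[ 1 + \| L(x_i - x_j) \|^2 - \| L(x_i - x_l) \|^2 \right]_+

Here j \rightsquigarrow i denotes the target neighbors of x_i, y_{il} is 1 when x_i and x_l share a label and 0 otherwise, and [\cdot]_+ is the hinge function.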

We also carried out some correctness and speedup tests, and it is evident that these bounds have added significant value to LMNN's efficiency overall.

Simultaneously, bounds for avoiding impostor recalculation were also put into effect. Fortunately, they didn't present us with many hurdles and are just about ready to be merged, though a good number of merge conflicts will need to be handled as #1461 merges.

Hopefully, we will see LMNN growing a lot more in the upcoming days :)

Implementing Essential Deep Learning Modules - Week 9

Building on last week's work on optimizing our ANN framework, we went ahead and implemented the EvaluateWithGradient() function for the FFN and RNN classes as well.

Though we had initially done this with the aim of reducing code duplication, we realized that with the above function implemented we were able to obtain at least a 30% speedup for simple FFN networks!

For the RNN class, the speedup was slightly lower, around 22-25%, primarily because of the heavier gradient computation routines being used. Nevertheless, we also applied the above function inside our GAN::EvaluateWithGradient() function, so a certain amount of speedup is expected there as well!

I also received my Phase II evaluations this week, and I'm glad that Marcus is satisfied with the effort that we have put in. I will continue to build on my work on RBMs and hopefully we can merge them as well before this month ends.

Mirupafshim

Alternatives to Neighborhood-Based CF - Week 8

This week I was working on the implementation of BiasSVD and SVD++. As the BiasSVD model is very similar to the RegularizedSVD model, the BiasSVD class is implemented by modifying the RegularizedSVD class. To test whether it works, I have added some simple tests, and I will complete the full set of tests later.

I also worked on SVDPlusPlus and on wrapper classes for these two new models (i.e., BiasSVDPolicy and SVDPlusPlusPolicy). These are also close to completion. I will finish the remaining work and push the code after my trip back from Europe :)
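For reference, these two models are typically formulated as the standard bias and implicit-feedback factorizations, where \mu is the global mean rating, b_u and b_i are the user and item biases, p_u and q_i are the latent factors, N(u) is the set of items rated by user u, and the y_j are implicit-feedback item factors:

\hat{r}_{ui} = \mu + b_u + b_i + p_u^\top q_i \quad \text{(BiasSVD)}

\hat{r}_{ui} = \mu + b_u + b_i + q_i^\top \left( p_u + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right) \quad \text{(SVD++)}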

Variational Autoencoders - Week 8

When using the Sequential object for the encoder and the decoder, it kept erroring out, so I corrected the Gradient function of that object. Also, the encoder wasn't participating in the backward pass at all. This was because the Backward() helper function of the FFN class does not go over the first layer of the network, as that isn't needed in most cases. So, Marcus suggested we use an IdentityLayer before the encoder. Another mistake which went unnoticed earlier was that in the Loss() function of the Reparametrization layer, the KL loss was always being added to the total loss, even when includeKl was false; I corrected that. To make keeping track of training progress easier, I overloaded the Evaluate() function of the FFN class. The new definition takes input (predictors) and target (responses) and returns the loss with the current parameters. I think it would have taken a lot longer to debug this without Sumedh's help.
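For reference, assuming the Reparametrization layer models the approximate posterior as a diagonal Gaussian N(\mu, \sigma^2) against a standard normal prior, the KL term that includeKl is meant to toggle has the usual closed form:

\mathrm{KL}\left( \mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, I) \right) = -\frac{1}{2} \sum_{j} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)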

I trained a VAE model with fully connected layers on 90% of MNIST for about 5 hours. I expected it to at least generate some blurry but distinguishable digits. Sadly, the resulting images seemed to contain just random noise. I am currently trying to figure out what's going wrong. I am also seeing some weird trends in the total loss while training.

Neural Collaborative Filtering - Week 8

The second phase has ended, and at this point I think we are very close to having a basic implementation of NCF in mlpack. I spent this week mainly modifying the GetRecommendations() method and creating the EvaluateModel() method; both have been completed and pushed. EvaluateModel() now evaluates the model on two metrics, hit ratio and RMSE. The Train() method hasn't been completed yet: slight modifications are still necessary to add Gradient() and Evaluate() to NCF, and work on it is ongoing with input from Marcus. The entire class can be tested once Train() is complete.

Right now I am also working on ncf_main, which will hopefully let us use NCF from the command-line interface too. By the end of this week I intend to have a properly trainable NCF so that all methods can be tested and the network evaluated. There might be some debugging necessary after Train() is completed, but apart from that, the basic class, along with the CLI, is expected to be ready by the end of the week.

LMNN (via LRSDP) & BoostMetric Implementation - Week 7 & 8

We started the week by adopting the well-known policy of divide and conquer in order to safely handle the large set of optimization tasks. Accordingly, Ryan opened several issues, one for each individual task.

The optimization tasks mainly cover imposing bounds over the data points wherever possible, caching and adapting the reference and query trees (in the newly transformed space) to avoid reconstructing the trees on every call to Impostors(), and verifying the correctness of the low-rank optimization. As of now, most of the bounds are derived (thanks to Ryan!) and tested successfully. The results depend considerably on the dataset. The low-rank optimization also seems to work pretty decently. Hopefully, we will see the runtime decrease further during the upcoming week.

And the best part is that we finally have LMNN merged. Thanks to Ryan and Marcus, the code and documentation were thoroughly fine-tuned before the merge. Shortly, we will probably start with the BoostMetric implementation as well.

Automatically-Generated Go Bindings - Week 08

Last week, I continued to focus on matrices, and was finally able to pass a gonum object from Go to C++ and wrap it into an Armadillo matrix. I was also able to pass a matrix back and wrap it into a Go matrix built from a float64 array. I ended my workweek trying to pass back a gonum matrix. The underlying data member of a gonum matrix is a blas64.General matrix, whose row and column capacity can be bigger than its actual row and column length; thus, simply wrapping a float64 array around the Armadillo memory pointer is not sufficient to pass the matrix back. I am therefore planning on dealing with this issue today, and tomorrow if needed. After that, I will deal with mlpack methods that return multiple outputs. In Python, a dictionary is used to handle this, but Go has built-in support for multiple return values, so I think it should be fairly easy to get this working in the bindings. However, to pass matrices we are using unsafe.Pointers, and using these has proved to have some unexpected side effects at times, so I can't rule out that some extra functions might be needed to make sure that returning multiple values works properly in the bindings. After making sure multiple values are passed as expected, I will start dealing with passing models from mlpack to Go!
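To make the capacity/stride issue concrete, here is a small standalone gonum example (not the binding code itself): a sub-matrix view keeps its parent's stride, so its Data slice is not a tightly packed rows-by-cols block, and the row-by-row repacking at the end is just one possible (copying) workaround.

```go
package main

import (
	"fmt"

	"gonum.org/v1/gonum/mat"
)

func main() {
	// A 4x4 matrix and a 2x2 view of it: the view's underlying
	// blas64.General keeps the parent's stride (4), so its Data slice is
	// not a tightly packed 2x2 block and cannot be handed to an Armadillo
	// constructor as-is.
	a := mat.NewDense(4, 4, []float64{
		1, 2, 3, 4,
		5, 6, 7, 8,
		9, 10, 11, 12,
		13, 14, 15, 16,
	})
	view := a.Slice(0, 2, 0, 2).(*mat.Dense)

	raw := view.RawMatrix()
	fmt.Println(raw.Rows, raw.Cols, raw.Stride) // 2 2 4

	// One (copying) workaround: repack the rows into a tight buffer before
	// handing the pointer over to C++.
	packed := make([]float64, raw.Rows*raw.Cols)
	for i := 0; i < raw.Rows; i++ {
		copy(packed[i*raw.Cols:(i+1)*raw.Cols],
			raw.Data[i*raw.Stride:i*raw.Stride+raw.Cols])
	}
	fmt.Println(packed) // [1 2 5 6]
}
```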

Implementing Essential Deep Learning Modules - Week 8

This week, we benchmarked the performance of our GAN module against TensorFlow's runtimes and worked on optimizing the routines even further. Then, we went ahead and implemented the EvaluateWithGradient() function for all the variants, which gave us a straight performance improvement of 13% over the previous update routine, cutting almost 45 minutes of training time.

Currently, TensorFlow has a training time of 4.5 hours (multi-threaded) and about 11 hours (single-core aggregate), whereas mlpack has a runtime of 6.25 hours (single-threaded). We (Marcus, Ryan, and Sumedh) have been discussing parallelizing the FFN class in order to benchmark in a multi-threaded environment as well. However, we decided to go ahead with implementing as many modules as we currently can, and to optimize them later as we continue benchmarking.

The RBM PR currently passes tests for stochastic input, and will have to be optimized for mini-batches, which will be done in Phase III. Phase II ends here, and I'm really glad that we were able to complete our planned goals so soon!

Totsiens

Automatically-Generated Go Bindings - Week 07

Last week I focused on passing matrices. I was able to copy them using a first-class array in Go. With Ryan's help, we are now also able to pass a matrix using the advanced Armadillo constructor and a const_cast hack. I am having a lot of difficulty passing a gonum matrix, as cgo does not allow passing Go objects that have underlying Go pointers in their structure, which results in a "cgo argument has Go pointer to Go pointer" panic when running the program. However, Ryan has pointed out to me that a gonum matrix holds a []float64 data member. I have also started to look at how to pass the memory pointer back to Go. So, I will try to pass the matrix as a []float64 this week, and will look at passing matrices back from C++ to Go without having to copy memory.
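As a rough sketch of that plan (handle_matrix is a hypothetical C helper, not an mlpack function): a plain []float64 contains no Go pointers, so its backing array may be handed to C directly under the cgo pointer-passing rules, whereas passing a whole gonum object is what triggers the panic above.

```go
package main

/*
#include <stddef.h>
// handle_matrix is a hypothetical C helper; the real binding would hand the
// pointer to an Armadillo constructor on the C++ side.
static void handle_matrix(double* data, size_t rows, size_t cols)
{
  (void) data; (void) rows; (void) cols;
}
*/
import "C"

import "unsafe"

// passMatrix hands the backing array of a row-major []float64 to C without
// copying. A float64 slice holds no Go pointers, so passing &data[0] is
// allowed by the cgo pointer-passing rules; passing a gonum matrix struct
// (which contains Go pointers internally) is what causes the
// "cgo argument has Go pointer to Go pointer" panic.
func passMatrix(data []float64, rows, cols int) {
	C.handle_matrix((*C.double)(unsafe.Pointer(&data[0])),
		C.size_t(rows), C.size_t(cols))
}

func main() {
	passMatrix([]float64{1, 2, 3, 4, 5, 6}, 2, 3)
}
```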