This page contains a list of all recent posts.
- Automatically-Generated Go Bindings - Week 09
- Variational Autoencoders - Week 9
- Neural Collaborative Filtering - Week 9
- LMNN (via LRSDP) & BoostMetric Implementation - Week 9
- Implementing Essential Deep Learning Modules - Week 9
- Alternatives to Neighborhood-Based CF - Week 8
- Variational Autoencoders - Week 8
- Neural Collaborative Filtering - Week 8
- LMNN (via LRSDP) & BoostMetric Implementation - Week 7 & 8
- Automatically-Generated Go Bindings - Week 08
- Implementing Essential Deep Learning Modules - Week 8
- Automatically-Generated Go Bindings - Week 07
This week I finished passing the matrix back and forth, which means I now have working handwritten bindings for the PCA method. I have also dealt with some small issues, such as how to pass default values, how to return multiple values, and how to print out documentation. For the first issue, since Go doesn't have method overloading or default values in function prototypes, I made a struct with the optional parameters, which are initialized to their default values when the struct is created. For the second, Go has built-in support for multiple return values, so we can simply return multiple values in the function prototype and won't need to pass a dictionary or other data structure to hold the output variables. For the last issue, we will use godoc, a tool that prints out the comment above a function prototype. This week, I will focus on finishing passing a model pointer to Go, after which I will start generating the C code!
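The optional-parameter struct and multiple-return-value pattern described above can be sketched roughly as follows. Note that the names (`PcaOptionalParam`, `InitializePca`, `Pca`) are illustrative placeholders, not the actual generated binding API:

```go
package main

import "fmt"

// Hypothetical optional-parameter struct for a PCA binding. Since Go has
// no default arguments or method overloading, optional parameters live in
// a struct whose constructor fills in the defaults.
type PcaOptionalParam struct {
	NewDimensionality int
	Scale             bool
	Verbose           bool
}

// InitializePca returns the struct with every field set to its default.
func InitializePca() *PcaOptionalParam {
	return &PcaOptionalParam{
		NewDimensionality: 0,
		Scale:             false,
		Verbose:           false,
	}
}

// Go functions return multiple values natively, so a binding can return
// (output, error) directly with no dictionary-like container.
func Pca(input []float64, param *PcaOptionalParam) ([]float64, error) {
	if param.NewDimensionality < 0 {
		return nil, fmt.Errorf("invalid dimensionality")
	}
	return input, nil // placeholder for the real transformation
}

func main() {
	param := InitializePca()
	param.Scale = true // override only the options you care about
	out, err := Pca([]float64{1, 2, 3}, param)
	fmt.Println(out, err)
}
```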
I was finally able to get some good results with the network. I used the MeanSquaredError loss for these results. Training with ReconstructionLoss generates barely recognizable digits. Sumedh and I were thinking that this might be due to some fundamental faults in the way we are modelling the distribution for the output.
I realized that I was using a ReLU activation after the decoder while normalizing the data to (-1, 1). This was the reason the reconstruction loss wasn't decreasing. After removing the activation, it trained well, and here are some results.
Passing a random Gaussian sample to the decoder:
Varying a latent variable of a Gaussian sample continuously:
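The ReLU issue described above can be illustrated with a tiny numerical sketch (toy numbers, not the actual network): a ReLU output is never negative, so targets normalized into (-1, 0) can never be matched, and the reconstruction loss hits a hard floor instead of decreasing.

```go
package main

import "fmt"

// relu clamps negative inputs to zero.
func relu(x float64) float64 {
	if x < 0 {
		return 0
	}
	return x
}

func main() {
	// Data normalized to (-1, 1): a negative target can never be reached
	// through a ReLU output, which is always >= 0.
	target := -0.8
	for _, pre := range []float64{-2.0, -0.5, 0.0, 0.3} {
		y := relu(pre)
		sqErr := (y - target) * (y - target)
		fmt.Printf("pre-activation %+.1f -> output %.1f, squared error %.2f\n",
			pre, y, sqErr)
	}
	// The best achievable output is 0, so the squared error is bounded
	// below by target^2 = 0.64: the loss plateaus rather than decreasing.
}
```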
To work with the decoder separately from the network, I first thought about serializing the parameters in the Sequential layer, but later realized that it only acts as a container and its parameters member is empty. After discussing it with Marcus, I have decided to solve this by overloading the Forward() function of the FFN class. The new definition will take additional arguments: the starting and ending indices of the layers to forward-pass through.
I am currently debugging a VAE model using convolutional layers. Once done, I think it will give better results.
Another week is over, and I think we are a few steps closer to having Neural Collaborative Filtering implemented in mlpack. This week I modified the Train() method and added Evaluate() based on suggestions from Marcus, so that we can deal with training instances within epochs. I also modified the current NCF class to accept an AlgorithmType and create a network according to the user-specified algorithm. I also completed ncf_main so that we can access the NCF algorithms from the command-line interface, and added neural matrix factorization to the existing algorithms.
Now that the basic structure is ready, I am currently spending time debugging the code. I had initially focused on making the model work on implicit feedback, but now I am making slight modifications to generalize it to both the implicit and explicit feedback cases. This week I intend to work on all the errors and small changes required, so that by the end of the week we have a working NCF class, trainable with multiple optimizers on both implicit and explicit data.
Last week, bounds for LMNN were the topic of prime focus. We started off by debugging the bounds-for-the-slack-term PR; eventually we found plenty of problems to deal with and quite a number of questions to answer. Ultimately, after addressing all the issues in an orderly fashion, we were able to develop much more efficient and scalable bounds for the slack term of the optimization problem. As a result, we now have a much higher pruning rate for inactive constraints. Apparently, the bounds are highly dataset-dependent: pruning can range from as low as zero prunes to pruning nearly all the inactive constraints. Generally the number increases over the iterations, until it settles to a more or less constant value.
We also carried out some correctness and speedup tests, and it is evident that these bounds have added significant value to LMNN's efficiency.
Simultaneously, bounds for avoiding impostor recalculation were also put into effect. Fortunately, they didn't present us with many hurdles and are just about ready to be merged, though a good number of merge conflicts will need to be handled as #1461 merges.
Hopefully, we will see LMNN growing a lot more in the upcoming days :)
Building on our work from last week on optimizing our ANN framework, we went forward with implementing the EvaluateWithGradient() function for the RNN classes as well.
Though we had initially done this with the aim of reducing code duplication, we realized that with the above function implemented, we were able to obtain at least a 30% speedup in the case of simple FFN networks. For the RNN class, the speedup was slightly lower, at around 25%, primarily because of the heavier gradient computation routines being used. Nevertheless, we also applied the above function inside our GAN::EvaluateWithGradient() function, so a certain amount of speedup is expected there as well!
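The idea behind combining evaluation and gradient computation can be sketched with a toy objective (this is a schematic Go illustration under my own assumptions, not mlpack's actual C++ optimizer API): separate Evaluate() and Gradient() calls each redo the shared forward pass, while the combined function runs it only once, which is where this kind of speedup comes from.

```go
package main

import "fmt"

// forward is the shared work: for the toy objective f(w) = sum(w_i^2)
// it stands in for a network's forward pass.
func forward(w []float64) float64 {
	s := 0.0
	for _, v := range w {
		s += v * v
	}
	return s
}

// Evaluate runs the forward pass just to get the objective value.
func Evaluate(w []float64) float64 { return forward(w) }

// Gradient must also run the forward pass in a real network, because the
// backward pass needs the stored activations.
func Gradient(w, g []float64) {
	_ = forward(w) // redundant work when called right after Evaluate
	for i, v := range w {
		g[i] = 2 * v
	}
}

// EvaluateWithGradient shares one forward pass between the objective and
// the gradient, eliminating the duplication above.
func EvaluateWithGradient(w, g []float64) float64 {
	obj := forward(w)
	for i, v := range w {
		g[i] = 2 * v
	}
	return obj
}

func main() {
	w := []float64{1, 2, 3}
	g := make([]float64, len(w))
	obj := EvaluateWithGradient(w, g)
	fmt.Println(obj, g) // 14 [2 4 6]
}
```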
I also received my Phase II evaluations this week, and I'm glad that Marcus is satisfied with the effort we have put in. I will continue to build on my work on RBMs, and hopefully we can merge them as well before this month ends.
This week I was working on the implementation of BiasSVD. The BiasSVD model is very similar to RegularizedSVD, so class BiasSVD is implemented by modifying class RegularizedSVD. To check whether it works I have added some simple tests, and I will complete the full set of tests later.
Besides that, I also worked on SVDPlusPlus and the wrapper classes for these two new models (i.e. BiasSVDPolicy and SVDPlusPlusPolicy). These are also close to completion. I will finish the remaining work and push the code after my trip back from Europe :)
When using the Sequential object for the encoder and the decoder, it kept erroring out. I corrected the Gradient() function of that object. Also, the encoder wasn't participating in the backward pass at all. This was because the Backward() helper function of the FFN class does not go over the first layer of the network, as that's not needed in most cases. So, Marcus suggested we use an IdentityLayer before the encoder. Another mistake that went unnoticed earlier was that in the Loss() function of the Reparametrization layer, the KL loss was always added to the total loss, even when includeKl was false. I corrected that. To make keeping track of training progress easier, I overloaded the Evaluate() function of the FFN class. The new definition takes input (predictors) and targets (responses) and returns the loss with the current parameters. I think it would have taken much longer to debug this without Sumedh's help.
I trained a VAE model with fully connected layers on 90% of MNIST for about 5 hours. I expected it to at least generate some blurry but distinguishable digits. Sadly, on seeing the results, the images seemed to just have random noise. I am currently trying to figure out what's going wrong. Also, I am seeing some weird trends with the total loss while training.
The second phase has ended, and at this point I think we are very close to having a basic implementation of NCF in mlpack. I spent this week mainly making modifications to the GetRecommendations() method and creating the EvaluateModel() method. They have been completed and pushed. EvaluateModel() now evaluates the model on two metrics: hit ratio and RMSE. The Train() method hasn't been completed yet; slight modifications are still necessary to add Evaluate() to NCF, and work on it is ongoing with input from Marcus. The entire class can be tested once Train() is complete.
Right now I am also working on ncf_main, which will hopefully let us use NCF from the command-line interface too. By the end of this week I intend to have a properly trainable NCF so that all the methods can be tested and the network evaluated. There might be some debugging necessary after Train() is completed, but apart from that, the basic class, along with the CLI, is expected to be ready by the end of the week.
We started the week by applying the well-known policy of divide and conquer in order to safely handle the massive optimization tasks; accordingly, Ryan opened several issues, each dealing with an individual task.
The optimization tasks mainly cover imposing bounds over the data points wherever possible, caching and adapting the reference and query trees (in the newly transformed space) to avoid reconstructing the trees on every call to Impostors(), and verifying the correctness of the low-rank optimization. As of now, most of the bounds are derived (thanks to Ryan!) and tested successfully. Apparently, the results depend considerably on the dataset. The low-rank optimization also seems to work pretty decently. Hopefully, we will see a further decrease in runtime during the upcoming week.
And the best part is, we finally have LMNN merged. Thanks to Ryan and Marcus, the code and documentation were thoroughly fine-tuned before the merge. We will probably start on the BoostMetric implementation shortly as well.
Last week, I continued to focus on matrices, and was finally able to pass a gonum object from Go to C++ and wrap it into an armadillo matrix. Further, I was also able to pass a matrix back and wrap it in a Golang matrix made of an n-dimensional float64 array. I ended my workweek trying to pass back a gonum matrix. The underlying data member of a gonum matrix is a blas64.General matrix. A blas64.General can have row and column capacities larger than its actual row and column lengths, so passing back the matrix by simply wrapping a float64 n-dimensional array around the armadillo pointer is not sufficient. I am therefore planning to deal with this issue today, and tomorrow if needed. After that, I will deal with mlpack's methods that return multiple outputs. In Python, a dictionary is used for this, but Go has built-in support for multiple return values, which makes me think it should be fairly easy to get working in the bindings. However, to pass matrices we are using unsafe.Pointers, and using these has proved to have some unexpected side effects at times. Therefore, I don't want to exclude the possibility that some helper functions might be needed to make sure that returning multiple values works properly in the bindings. After making sure multiple values are being passed as expected, I will start dealing with passing models from mlpack to Go!
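The capacity problem can be sketched with a minimal stand-in for blas64.General (this mirrors gonum's row-major layout but is not gonum code, and `Flatten` is a hypothetical helper): the stride between row starts can exceed the number of columns, so only a tightly-packed copy of the logical block should cross the C boundary.

```go
package main

import "fmt"

// General is a minimal stand-in for blas64.General: row-major data where
// Stride (the distance between consecutive row starts) can exceed Cols,
// e.g. when the matrix is a view into a larger backing array.
type General struct {
	Rows, Cols, Stride int
	Data               []float64
}

// Flatten copies the logical Rows x Cols block into a tightly-packed
// row-major slice that could be handed to C and wrapped in an armadillo
// matrix. Passing Data directly would include the padding columns.
func Flatten(g General) []float64 {
	out := make([]float64, g.Rows*g.Cols)
	for r := 0; r < g.Rows; r++ {
		copy(out[r*g.Cols:(r+1)*g.Cols],
			g.Data[r*g.Stride:r*g.Stride+g.Cols])
	}
	return out
}

func main() {
	// A 2x2 view into a 2x3 backing array: Stride is 3 but Cols is 2,
	// so elements 3 and 6 are padding that must not be passed along.
	g := General{Rows: 2, Cols: 2, Stride: 3,
		Data: []float64{1, 2, 3, 4, 5, 6}}
	fmt.Println(Flatten(g)) // [1 2 4 5]
}
```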
This week, we benchmarked the performance of our GAN module against Tensorflow's runtimes, and worked on optimizing the routines even further. Then, we went forward with implementing the EvaluateWithGradient() function for all the variants, which gave us a straight performance improvement of 13% over the previous update routine, cutting almost 45 minutes of training time.
Tensorflow has a training time of 4.5 hours (multi-threaded) and about 11 hours (single-core aggregate), whereas mlpack has a runtime of 6.25 hours (single-threaded). We (Marcus, Ryan and Sumedh) have been discussing parallelizing the FFN class in order to benchmark in a multi-threaded environment as well. However, we decided to go forward with implementing as many modules as we currently can, and to optimize them later as we go on benchmarking.
The RBM PR currently passes tests for stochastic input, and will have to be optimized for mini-batches, which will be done in Phase III. Phase II ends here, and I'm really glad that we were able to complete our planned goals so soon!
Last week I focused on passing matrices. I was able to copy them using first-class arrays in Go. With the help of Ryan, we are now also able to pass a matrix using the advanced armadillo constructor and a const_cast hack. I was having a lot of difficulty passing a gonum matrix, as cgo does not allow passing Go objects that have underlying Go pointers in their structure, which resulted in a "cgo argument has Go pointer to Go pointer" panic when running the program. However, Ryan pointed out to me that a gonum matrix holds a float64 data member. I have also started to look at how to pass the memory pointer back to Go. Thus, this week I will try to pass the matrix as a float64 structure, and will look at passing matrices back from C++ to Go without having to copy memory.
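The pointer-passing idea can be sketched like this (with `dataPointer` as a hypothetical helper, not the actual binding code): cgo forbids passing memory that itself contains Go pointers, but a flat []float64 backing array contains none, so a pointer to its first element is something a C wrapper could legitimately receive alongside the dimensions.

```go
package main

import (
	"fmt"
	"unsafe"
)

// dataPointer returns an unsafe.Pointer to the first element of a
// float64 slice. Passing a struct with nested Go pointers to C triggers
// the "cgo argument has Go pointer to Go pointer" panic, but a flat
// []float64 is pointer-free memory, so this pointer (plus the row and
// column counts) is what a hypothetical C wrapper would receive.
func dataPointer(s []float64) unsafe.Pointer {
	return unsafe.Pointer(&s[0])
}

func main() {
	elems := []float64{1, 2, 3, 4}
	p := dataPointer(elems)
	// Reading back through the raw pointer shows no copy was made.
	first := *(*float64)(p)
	fmt.Println(first) // 1
}
```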