mlpack  blog
mlpack Documentation


Variational Autoencoders - Summary

Sorry for forgetting to put the summary on the blog. I made a repository for the work report; here it is: https://github.com/akhandait/GSoC-2018

Thanks again everybody for the awesome summer! It went by in a blur. :) I hope I get to do more amazing things with mlpack.

Neural Collaborative Filtering - Summary

Overview

The project's main aim was to add to mlpack a set of algorithms that perform collaborative filtering using neural networks, based on the paper Neural Collaborative Filtering by Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua (2017). The neural collaborative filtering algorithms were expected to enable the use of implicit feedback alongside explicit feedback when training recommendation models. The models were also expected to represent complex user-item interactions that are normally missed by matrix-factorization-based collaborative filtering methods.

Implementation

CF class refactoring

One of the initial challenges was integrating the new collaborative filtering methods into the existing class. We decided to move the CF class to a policy-based design in which a template parameter, DecompositionPolicy, is used to perform CF with different decomposition methods. Pull Request #1355 focused on this, and we were able to merge the PR with some useful functionality, so that adding new decomposition methods to CF will be much easier in the future. The PR also added RandomizedSVD as a decomposition policy in CF and modified the CF class tests to accommodate these changes.

New layers in ann module

Another challenge for the project was the unavailability of some layers necessary for building the NCF networks, such as Embedding, MultiplyMerge, and Flatten. After much discussion, the lookup layer in mlpack was aliased for use as an embedding layer in Pull Request #1401. A MultiplyMerge layer was created to perform element-wise multiplication while merging two networks (Pull Request #1392).

The NCF networks take two inputs, user and item, so the resulting structure cannot be trained with the normal Train(), which accepts only one input. We decided to create a Subview layer that can split the input data within the network to manage this issue. The Subview layer acts as an intermediate layer with functions for creating submatrices, vectorizing, flattening, and so on. Pull Request #1428 implements this layer, and I believe it is a very useful addition to the codebase. The Subview layer was further modified in Pull Request #1435 to adapt it to matrices and enable batch support.

NCF class

The major part of the project, the neural collaborative filtering class, was added in Pull Request #1454. The class started off with basic data members and member functions, and methods were added to it over the course of the project. The class currently has the following major methods:

  • FindNegatives() - Collect the items which haven't been rated by a user and store them in a vector, to be used later when collecting instances to train on.
  • GetTrainingInstance() - Depending on whether the user requires the model to use implicit or explicit feedback, this method creates instances for training, maintaining a user-defined ratio between positive (rated) and negative (non-rated) instances. The method returns predictors and responses for the network to be trained on.
  • CreateGMF(), CreateMLP(), CreateNeuMF() - These are the methods for creating networks according to the chosen algorithm. Each has a unique network structure, implemented using the Sequential, MultiplyMerge, Linear, Concat, Subview, and Embedding layers.
  • GetRecommendations() - Generate n recommendations based on the trained model and the ratings it predicts.
  • EvaluateModel() - This method evaluates the model based on the recommendations it generates, using metrics such as RMSE and hit ratio.
  • Train(), Gradient() and Evaluate() - These functions enable training the model through NCF so that the training instances can be changed between epochs, with the calls forwarded to the corresponding functions of FFN.

Command Line Interface

NCF has a command line interface, implemented through ncf_main in Pull Request #1454. It lets the user train and test on any dataset, specifying whether the dataset is to be treated as implicit or explicit, the algorithm to use (GMF, MLP, or NeuMF), the number of negative instances per positive instance to train on, the embedding size to use in the network, the optimizer to use while training, and so on. It was decided to have a separate CLI for NCF, since the arguments and functions required by CF and NCF have little in common.

CLI usage

The command line interface provides many options to use the NCF class.

  • --training (-t) specifies the training dataset, given as a CSV file.
  • --algorithm (-a) specifies the algorithm to use: 'GMF', 'MLP', or 'NeuMF'.
  • --all_user_recommendations (-A) can be set to generate recommendations for all users, and --query (-q) for a specific set of users.
  • --implicit (-i) sets whether the dataset's ratings are to be treated as implicit or explicit feedback. If the dataset contains explicit ratings and the flag is set, the ratings are converted to implicit.
  • --recommendations (-c) sets the number of recommendations to generate.
  • --optimizer (-z) chooses an optimizer from among 'adagrad', 'rmsprop', 'adam', and 'SGD'.

Example usage:

  • mlpack_ncf --help : Get full documentation.
  • mlpack_ncf -t "ml-100.csv" -A -o "recommendations.csv" -c 10 : Train on ml-100.csv and save 10 recommendations for each user to recommendations.csv.
  • mlpack_ncf -t "ml-100.csv" -a "GMF" -z "SGD" -T "ml-100test.csv" : Train using the GMF algorithm and the SGD optimizer, and test on the dataset ml-100test.csv.

Other Work

Some time was spent on dataset collection and modification to suit the requirements of implicit feedback data. Pull Request #1422 was opened to create a wrapper NCFNetwork class to enable feeding multiple input matrices to the same network, but it was later discarded once the Subview layer offered an alternative approach.

Results

Currently, GMF gives an RMSE of 3.3 after a single epoch on the MovieLens 100k dataset. Multi-epoch training and RMSE reduction are ongoing work.

Future Work

  • Reduce the execution time of NCF methods such as GetTrainingInstance().
  • Improve RMSE and hit ratio.
  • Add proper batch support to training and testing.
  • Write tests for NCF and ncf_main.

Conclusion

It has been a wonderful summer working with mlpack. There has been a lot of coding, experimentation, a million builds, and never-ending debugging; to think that it is coming to a wrap-up doesn't feel quite real. Now that I have a much better understanding of the mlpack codebase than when the summer began, I would like to keep contributing. Thanks a lot to Marcus for your constant support throughout the summer. You have made it loads easier to work; well, you have the solution to all errors :). I would also like to acknowledge the help from Ryan, who has always been available whenever needed, and the whole mlpack team. This has been a summer worth remembering!

Automatically-Generated Go Bindings - Summary

The following is a summary of my GSoC project.

My weekly progress blog posts can be found at the following link: http://mlpack.org/gsocblog/YasmineDumouchelPage.html

The Go Binding Generator

For the past three months, I have been implementing a Go binding generator. As discussed in my weekly progress posts, the generator produces three files for every binding:

  1. A .go file: This file is the Go interface for Go users. It uses cgo in order to share data and communicate with C. When generated, the .go file is built in the mlpack/binding/go/mlpack/ directory.
  2. A .h file: This file serves as the C interface. It declares the functions that can be called from Go.
  3. A .cpp file: This file acts as 'glue code' between the C interface (the .h file) and the code of mlpack's methods.

Additionally, a method-specific library is built from method_main.cpp so that Go can call the mlpack_main() function.

Several utility files for the CLI object and Armadillo objects have also been implemented. The .cpp, .hpp, and .h utility files can be found in the mlpack/binding/go/mlpack/capi directory, and the .go utility files in the mlpack/binding/go/mlpack/ directory.

Pull Request

My pull request can be found here: https://github.com/mlpack/mlpack/pull/1492

This PR hasn't been merged yet, as I am not yet finished implementing the binding generator. More precisely, two parameter types have yet to be implemented: the matrix-with-dataset-info parameter type, which would allow one to generate bindings for the Hoeffding tree and decision tree methods, and the vector-of-strings and vector-of-ints parameter types. After these are implemented, additional tests for these parameters will be needed. Furthermore, documenting the generator with examples and tutorials for users would also be useful. I plan on doing those tasks and continuing the Go binding generator until it is fully done.

How To Use The Binding Generator

  • GENERATING THE BINDING

First, to generate the bindings, get mlpack into your Go workspace:

$ go get github.com/yaswagner/mlpack

Then, in your terminal, go to the mlpack folder, create your build directory, and configure CMake as desired. For example:

$ cd ${GOPATH}/src/mlpack
$ mkdir build
$ cd build
$ cmake -D BUILD_GO_BINDINGS=ON -D BUILD_PYTHON_BINDINGS=OFF ../
$ make 

You can build a specific binding by using the mlpack_go_{method} target. For instance, for the pca method:

$ make mlpack_go_pca
  • USING THE BINDINGS

To use the bindings in Go, you must import the mlpack package and the gonum package.

For example:

import (
  "mlpack/build/src/mlpack/bindings/go/mlpack"
  "gonum.org/v1/gonum/mat"
)

Optional parameters are set through the method's config struct. For instance, when using the pca method, we initialize the config struct, called param, and set verbosity to true by doing:

param := mlpack.InitializePca()
param.Verbose = true

We can then pass our optional parameters and call the Pca method as such:

output := mlpack.Pca(input, param)

For output parameters we do not wish to use, we can use an underscore. For example, for the perceptron, if we wish to use only 'output_model' and not 'output':

_, output_model := mlpack.Perceptron(param)

In order to get documentation about a method, we use 'godoc'. For example, for the pca method we would type on the command line:

$ godoc mlpack/build/src/mlpack/bindings/go/mlpack Pca
  • TESTING THE BINDINGS

Finally, to test the bindings, Go's 'testing' package is used. Simply go to the directory containing the Go test files:

$ cd ${GOPATH}/src/mlpack/src/bindings/go/tests/gofile

Then test by using the 'go test' command tool, as such:

$ go test -v

Acknowledgements

I want to thank mlpack for giving me the opportunity to work on this project. This was my first open-source experience, and I feel beyond lucky to have gained as much valuable knowledge and as many tools as I did. I also want to thank my mentor, Ryan Curtin, for his help and guidance throughout the summer. I could not have asked for more in a mentor. He has made this experience a fun one as well as a great learning opportunity. I look forward to continuing to implement the bindings, and perhaps even implementing a generator for another language sometime in the future!

Alternatives to Neighborhood-Based CF - Summary

Brief Summary

This blog post summarizes the work I have done for my GSoC 2018 project, Alternatives to Neighborhood-Based CF. The goal of my project was to add alternative algorithms to the Collaborative Filtering module in mlpack. The algorithms I have completed include different rating normalization methods, neighbor search methods, weight interpolation methods, and the BiasSVD and SVD++ models.

Completed Algorithms

Data Normalization

With data normalization in CF, raw ratings are normalized before matrix decomposition is performed. When predicting missing ratings, the data is 'denormalized' back to the original scale. As this benchmarking result shows, data normalization is important for improving performance. The following are brief explanations of the data normalization methods that have been implemented.

  1. NoNormalization : Default normalization class; it doesn't perform any data normalization.
  2. OverallMeanNormalization : Normalize ratings by subtracting the overall mean rating.
  3. UserMeanNormalization : Normalize ratings by subtracting the corresponding user's mean rating.
  4. ItemMeanNormalization : Normalize ratings by subtracting the corresponding item's mean rating.
  5. ZScoreNormalization : Perform z-score normalization on ratings.
  6. CombinedNormalization : Perform a sequence of normalization methods on ratings. For example, CombinedNormalization<OverallMeanNormalization, UserMeanNormalization, ItemMeanNormalization> performs OverallMeanNormalization, UserMeanNormalization, and ItemMeanNormalization in sequence.

The code for data normalization can be found in folder mlpack/src/mlpack/methods/cf/normalization, or in PR #1397.

For more information on rating normalization, refer to this paper.

Neighbor Search

Only neighbor search with Euclidean distance was implemented before the start of this project. I refactored the code and added neighbor search methods using cosine distance and Pearson correlation. The following are brief explanations of the neighbor search methods that have been implemented.

  1. LMetricSearch : Searching with the l_p distance is the general case of searching with Euclidean distance. EuclideanSearch is an alias for LMetricSearch<2>.
  2. CosineSearch : Search for neighbors and return similarities using cosine distance.
  3. PearsonSearch : Search for neighbors and return similarities using Pearson correlation.

All similarities returned by the methods above are restricted to the range [0, 1].

With normalized vectors, neighbor search with cosine/Pearson distance is equivalent to neighbor search with Euclidean distance. Therefore, instead of performing neighbor search directly with cosine/Pearson distance, the vectors in the reference/query set are normalized, and then neighbor search with Euclidean distance is used.

The code for neighbor search policies can be found in folder mlpack/src/mlpack/methods/cf/neighbor_search_policies, or in PR #1410.

Weight Interpolation

Before the start of this project, the predicted rating was calculated as the average of the neighbors' ratings. I refactored the code and added two more weight interpolation algorithms: SimilarityInterpolation, where weights are based on neighbor similarities, and RegressionInterpolation, where weights are calculated by solving a regression problem. The following are brief explanations.

  1. AverageInterpolation : Interpolation weights are identical and sum up to one.
  2. SimilarityInterpolation : Interpolation weights are calculated as normalized similarities and sum up to one.
  3. RegressionInterpolation : Interpolation weights are calculated by solving a regression problem.

With interpolation weights, the CF algorithm multiplies each neighbor's rating by its weight and sums them to predict the rating.

The code for weight interpolation policies can be found in folder mlpack/src/mlpack/methods/cf/interpolation_policies, or in PR #1410.

For more information on RegressionInterpolation, refer to this paper.

Bias SVD

BiasSVD is similar to RegularizedSVD. The difference is that BiasSVD also models user/item bias. In BiasSVD, the rating is predicted as

$ r_{iu} = \mu + b_i + b_u + p_i * q_u $,

where $\mu$ is the mean rating, $b_i$ is the item bias, $b_u$ is the user bias, $p_i$ is the item latent vector, and $q_u$ is the user latent vector.

As with RegularizedSVD, BiasSVD is optimized using Stochastic Gradient Descent (SGD).

The code for BiasSVD can be found in folder mlpack/src/mlpack/methods/bias_svd/, or in PR #1458.

SVD++

SVD++ is a more expressive model. Besides explicit ratings, SVD++ also takes implicit feedback as input and learns latent vectors to model implicit feedback. For each item, a latent vector is used to model the relationship between the item and a user in terms of implicit feedback.

In SVD++, rating is predicted as

$ r_{iu} = \mu + b_i + b_u + p_i * (q_u + |I(u)|^{-1/2} \sum_{t \in I(u)} y_t) $,

where $\mu$ is the mean rating, $I(u)$ is the set of items with which user $u$ has interacted, and $y_t$ is the latent vector that models the implicit feedback for item $t$.

As with RegularizedSVD and BiasSVD, SVDPlusPlus is optimized using Stochastic Gradient Descent (SGD).

The code for SVD++ can be found in the folder mlpack/src/mlpack/methods/svdplusplus/, or in PR #1458.

Please read this paper for more explanation of SVD++.

Other modifications/refactoring

  1. To make the addition of new CF models (e.g. BiasSVD, SVD++) easier, I refactored the decomposition policies. The modifications are: 1) all model parameters are moved from the class CFType<> to the class DecompositionPolicy; 2) DecompositionPolicy has to implement the method GetRating() to compute the rating for a given user and item; 3) DecompositionPolicy has to implement the method GetNeighborhood() to compute neighborhoods for given users. (This modification is in PR #1458.)
  2. The class CFModel was implemented for use in the cf main program. When mlpack_cf is executed from the command line, CFModel is serialized instead of the class CFType<>. CFModel is needed by the main program because CFType is a class template. (This modification is in PR #1397.)

So far, the PRs (#1397, #1410) for data normalization, neighbor search, and weight interpolation have been merged. The PR (#1458) for BiasSVD and SVD++ is quite long and is still under review and debugging, but it should also be merged soon.

To Do

  1. Add support for the alternative normalization methods, neighbor search methods, and weight interpolation methods in the cf main program. Currently the cf main program only supports NoNormalization, EuclideanSearch, and AverageInterpolation.
  2. Write automatic benchmarking scripts to compare the CF module in mlpack with other recommender system libraries.

Acknowledgements

Working on my GSoC project this summer has been really amazing and rewarding. This is the first time I have contributed to an open-source library, and I've found the fun in it. I especially want to thank Mikhail for reviewing my code and giving useful suggestions on the implementation of specific algorithms. I also want to thank Marcus, Ryan, and all the community members who have been super helpful in answering my questions. Although GSoC is coming to an end, I will still contribute to the mlpack library and work on improving the cf module and implementing other algorithms.

Implementing Essential Deep Learning Modules - Week 12 and 13

We're finally here at the end of our GSoC journey. During the last two weeks, progress has been a bit slow, as I became more involved in my semester studies and in preparing for my industrial internship interviews.

The CIFAR-10 test on the SSRBM and the DualOptimizer PR are both still in progress. I had a talk with Atharva about refactoring the Gradient() method of the Convolution and AtrousConvolution layers. The PR is ready for review and should be merged shortly.

Apart from these two tasks, we have been experiencing a test failure of the RBM module on the mlpack testbench, which we are investigating as well. These tasks will be worked on after the GSoC period ends. I also had a couple of additional goals that I wished to implement, namely StackGAN, SeqGAN, and Deep Belief Networks. I hope to stay with the community and keep contributing to its infrastructure.

Farewell!

LMNN (via Low-Rank optimization) & BoostMetric Implementation - Summary

Google Summer of Code Project Objectives :

In my proposal, I aimed to implement the LMNN (Large Margin Nearest Neighbors) and BoostMetric distance metric learning techniques.

LMNN Implementation :

Initially, the goal was to have an LMNN implementation based on a low-rank SDP optimizer. However, after some realizations, it eventually turned into a low-rank linear optimization problem, completely removing the SDP projection step from the picture while keeping the base idea of LMNN intact.

This low-rank formulation led to an initial implementation (#1407 LMNN distance learning) which is generic in terms of the optimizer, meaning most optimizers can easily be plugged into LMNN, allowing LMNN to exploit most of their characteristics. For instance, the implementation allows the user to select from AMSGrad, Big Batch SGD, SGD, and L-BFGS by simply passing an optimizer flag.

The process didn't stop there; many more exciting optimizations were still to be employed. The first few include -

  • Pruning inactive constraints.
  • Employing batches while using SGD based optimizers such as AMSGrad, Big Batch SGD and SGD.

Here are some simulations we obtained from #1407, with k = 3.

Dataset     mlpack (secs)   shogun (secs)   matlab (secs)
iris        0.028843        1.340270        1.816999
satellite   6.099969        122.910678      1161.872116
ecoli       0.020087        5.918733        90.625
vehicle     0.620096        18.766937       55.068948
balance     0.071944        10.840742       3.332948
letter      19.975593       6416.926464     -

Dataset     mlpack (%)      shogun (%)      matlab (%)
iris        97.3333         97.33333        96.0
satellite   93.10023        94.77855        93.58974
ecoli       91.071428       92.410714       90.625
vehicle     81.6785         80.023640       65.24822
balance     90.72           81.28           77.60
letter      97.0            97.095          -

Illustration of the learning curve over the seed dataset :

After finishing up #1407, the next step was to perform a number of other substantial optimizations, and this led to the opening of several issues -

Out of those, #1447, #1448, and #1449 were successfully handled, together with verifying the correctness and speedups from each one of them. #1445 and #1446 are still in progress and hopefully will be completed in the very near future.

The simulation below (performed with the same parameters as the above benchmarks) depicts the current LMNN performance after these optimizations:

Dataset     Runtimes (secs)   Accuracy (%)
iris        0.014874          97.3333
satellite   2.802369          94.0793
ecoli       0.016140          93.75
vehicle     0.600228          78.8416
balance     0.074875          93.44
letter      31.89887          97.0

Here's an illustration of the result of LMNN learning on a dummy dataset -

Finally, here's a record of all PRs relevant to the LMNN implementation -

  1. Merged :
  2. Open:
  3. Closed:

BoostMetric Implementation :

A major portion of the project time went into developing a novel implementation of LMNN, which led to a shortage of time for BoostMetric. Despite the limited time, we were able to implement it successfully. Here's the BoostMetric PR, #1487 Implementation of BoostMetric Distance Learning, which is currently open. Hopefully it will be merged very soon as well, once we have sufficient results to convince ourselves. We even ran a number of simulations against our LMNN implementation to get an idea of what more can be improved.

Highlights

  • The very first task of reframing the SDP problem as a low-rank linear optimization problem, and then verifying its correctness, itself consumed quite an amount of time and mental resources.
  • Efficiently calculating the target neighbors and impostors for each data point was a challenge in itself. For this, we are already exploiting a major portion of the nearest-neighbor and binary-space-tree code and structure that is already in mlpack. Perhaps we will see even further speedups once the work related to #1445 and #1446 completes.
  • Deriving and implementing bounds, and seeing speedups come through at the same time, was the most exciting phase. We derived numerous bounds for different LMNN terms (honestly, most of them were adaptations of initial bounds derived by my mentor, Ryan), and the best part is that we got to see a good amount of speedup from each one of them, though sometimes the results were highly dataset-dependent.
  • Truthfully, performing simulations was one of the most tedious tasks, but I can't deny it was one of the most valuable as well. Sometimes it took days to see the results. But I was fortunate to have a caring mentor and organization, who aided me at every point and even provided me access to a good build system, which made this work a lot easier.
  • Visit all my previous GSoC blog posts.

Conclusion

Ah, this was one of the best experiences I have ever had. I didn't think these three months would pass this quickly. Each and every day was full of new encounters; I couldn't have expected more. I am thankful to Ryan, Marcus, and the whole awesome mlpack organization for this wonderful period of time. Thanks, Ryan, for always being there and helping me out at each and every step. I appreciate all the help you provided and all the patience you kept with me. Without your thoughtful and clever ideas, I don't think the project could be in the state it is now. Also, a big thanks to Marcus for helping out every time I asked; I really appreciate all you did for me when I was stuck with the benchmarking scripts. Finally, I am grateful to Google for providing me a once-in-a-lifetime experience, eventually making me more comfortable with open source. GSoC is definitely a novel program helping thousands of students worldwide. (◕‿◕)

LMNN (via Low-Rank optimization) & BoostMetric Implementation - Week 11 & 12

For the past two weeks we have been working on finishing things up related to the LMNN optimizations. We got some of the optimizations, mostly related to bounds, completely validated (in terms of correctness and speedups) and merged, though work related to tree-based optimizations is still in progress. Hopefully, we will be able to complete it shortly. Additionally, Ryan has been working on the LMNN target neighbors and impostors rules in order to avoid multiple passes of KNN during their computation. He has already witnessed significant speedups in this regard.

Unfortunately, we are still seeing some bugs in the LMNN code related to the BigBatchSGD optimizer, leading to build timeouts; more specifically, execution gets stuck at some point. Most likely the problem is related to the shuffling of data points leading to changes in the objective function. Rest assured, we are determined to solve this at the earliest opportunity.

Meanwhile, a PR related to BoostMetric has been opened, and it may be in its final stages very soon. Some validation of the results, and some amount of code optimization wherever applicable, will still need to be taken care of.

Hopefully, we will be able to achieve all our goals we aimed for.

Automatically-Generated Go Bindings - Week 12

During the past week, I have mostly been implementing the functions that print the bindings for every method (the .cpp, .h, and .go binding files). On Tuesday, Ryan made me realize that the perceptron model passing was not fully working; in fact, I was able to pass a model from Go to C++, but passing back the 'output_model' parameter as the 'input_model' parameter was not working correctly. Thus, I made sure the latter issue was fixed. As for the generator, I have finished the program which prints the binding files, with the exception of printing the documentation. I am still not done with the generator, so I will continue working on it this week. I will finish implementing the printing of the documentation and will start testing. I will also need to make sure everything compiles with Go and that cgo links properly with the bindings, which will probably require some adjusting of the bindings or the CMake files.

Variational Autoencoders - Week 11

I did some refactoring of the code in the PR for the models repository. I modified the generation scripts and played around with the learned distribution.

The following samples are the result of modifying each latent variable independently out of the 10.

As we can see, only the 3rd, 4th, and 9th latent variables affect the generated data.

So, I took two of them, 3rd and 4th, and changed them in two dimensions. Here is how it looks.

The digits which can't be seen here are the ones that depend on the 9th latent variable; I tried it, and collectively these three latent variables can generate all of MNIST.

I also did some cherry-picking and split up the huge ReconstructionLoss PR. The NormalDistribution PR will be kept open until we can prove that it works. The BernoulliDistribution PR has been reviewed and needs to be merged before the other PRs.

For CVAEs (Conditional Variational Autoencoders), we will need to use the Forward(), Backward(), and Update() functions instead of Train(), as we need to append the labels to the output of the Reparametrization layer midway through the forward pass.

I am now training a convolutional VAE model on the CelebA dataset; I think it's on this dataset that experiments with beta-VAEs and CVAEs can be really interesting.

I had trained a model on binary MNIST as well, here are the results. Sampling from the prior:

Sampling from the posterior:

Neural Collaborative Filtering - Week 10 & 11

Week 10 went a bit slowly due to college reopening, moving, and so on, which is why I skipped writing a not-so-significant blog post. Last week I debugged the whole NCF class code along with ncf_main, and it is building successfully now. I have also added support for the user's choice of implicit or explicit feedback (a normal dataset can be converted to implicit form if necessary). We have decided to move on with two separate command line executables for cf and ncf, since the CLI arguments and the methods they work with have little in common. There was a slight issue with the CreateNeuMF() method which I have been trying to debug; I have not yet been able to identify the source of the error.

So this week I intend to debug NeuMF while testing the implemented GMF and MLP methods in parallel. It's finally time to see the results! I have also started writing some proper tests for NCF and intend to finish them within this week. If all goes well, we should have some comparable results, a working CLI, and tested NCF code by the end of this week. :)

Alternatives to Neighborhood-Based CF - Week 10 & 11

The work needed to implement BiasSVD and SVD++ was more than I expected, and this task has taken more time than scheduled. Over the past two weeks, I first debugged the errors in BiasSVD (thanks to Mikhail's careful review), and then finished the implementation of the SVDPlusPlus class. After making sure that SVDPlusPlus was working reasonably, I implemented wrapper classes for BiasSVD and SVDPlusPlus which are used by CFType<> as decomposition policies. Then, I refactored CFModel so that the pointer to the actual model class is saved as a boost::variant<...>. BiasSVD and SVD++ are not supported in the cf executable yet; I will implement that support after this PR is merged (so that we are sure BiasSVD and SVD++ are working properly). I also spent quite some time debugging the Travis test failure related to Armadillo.

Next week, I will carry on implementing test cases and filling in code comments. I will also start to work on CF benchmarking if time permits.

Automatically-Generated Go Bindings - Week 11

This week has been very exciting, as I managed to generate some C and some Go code for the first time. I first made a CMake file with three sections. The first generates a generate_cpp_method executable which then prints a method.cpp (where method is any mlpack method). The second generates a generate_h_method executable which then prints a method.h, and lastly, a generate_go_method which then prints a method.go. I have since started to implement the programs that print those three files; to do so, I have created a GoOption class and some utility files. I then paused implementing the binding-printing programs and got back to the handmade perceptron Go binding, to handle the passing of models from Go to C++ and vice versa.

My goal for next Sunday is to have the programs that print the method.cpp, method.h, and method.go files done. It might seem like a lot, but now that everything is printing correctly, I think implementing those programs by the end of the week is a reasonable task!