Deep Learning Module in MLpack (Week 10)

Week Ten

The past week has seen some good progress. I was finally able to finish the ssRBM & binary RBM PR. I also made some progress on fixing the GAN network: mostly I just cleaned up the code and added a different initialization strategy (initializing the weights on a per-layer basis). This actually fixed the vanishing-gradient error that we were experiencing with the GAN PR. I also added a simpler test to check our implementation; it is based on Edwin Chen's blog and the test that Goodfellow's original paper uses. The test is meant more for showing that the generated outputs are very close to the real data.
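For anyone curious what I mean by per-layer initialization, here is a rough sketch of the idea (toy code of mine, not the actual PR): each layer's weights are drawn from a zero-mean Gaussian whose scale depends on that layer's dimensions, rather than using one global scale for the whole network.

```cpp
#include <armadillo>
#include <cmath>
#include <iostream>

// Hypothetical helper: initialize one layer's weight matrix from a
// zero-mean Gaussian whose standard deviation shrinks with the layer's
// size (Xavier-style), so no layer starts with weights that are too
// large or too small for its fan-in/fan-out.
arma::mat InitLayer(const size_t fanIn, const size_t fanOut)
{
  const double stddev = std::sqrt(2.0 / (fanIn + fanOut));
  return stddev * arma::randn<arma::mat>(fanOut, fanIn);
}

int main()
{
  // Each layer gets its own scale based on its dimensions.
  arma::mat w1 = InitLayer(784, 128);
  arma::mat w2 = InitLayer(128, 10);
  std::cout << "w1 stddev: " << arma::stddev(arma::vectorise(w1)) << "\n"
            << "w2 stddev: " << arma::stddev(arma::vectorise(w2)) << std::endl;
}
```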

The goal for the next week is mainly finishing up the GAN PR. The major problem, as pointed out by Mikhail, is the CreateBatch function; I think that needs refactoring.
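I don't have the refactored version yet, but conceptually creating a batch should just be cheap column slicing over the data matrix. Something along these lines (a sketch with a hypothetical signature, not the actual CreateBatch):

```cpp
#include <algorithm>
#include <armadillo>
#include <iostream>

// Sketch of batch extraction: mlpack stores one point per column, so a
// batch is just a contiguous block of columns. This returns a copy for
// simplicity; inside the trainer a subview could be used directly.
// Assumes batchIndex is in range.
arma::mat CreateBatch(const arma::mat& data,
                      const size_t batchSize,
                      const size_t batchIndex)
{
  const size_t begin = batchIndex * batchSize;
  const size_t end = std::min(begin + batchSize, (size_t) data.n_cols) - 1;
  return data.cols(begin, end);
}

int main()
{
  arma::mat data = arma::randu<arma::mat>(784, 1000);
  arma::mat batch = CreateBatch(data, 32, 3);  // columns 96..127
  std::cout << batch.n_rows << " x " << batch.n_cols << std::endl;
}
```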


Deep Learning Module in MLpack (Week 7)

Week Seven

This week I mostly tried to complete the ssRBM and GAN PRs. The majority of the time was spent getting both implementations to work on the test dataset, and we finally managed to do so. With the ssRBM PR we were running into memory-management errors, because I was allocating around ~30 GB of memory for the parameters: I had declared all the parameters as full matrices, but I managed to reduce them to just vectors. The remaining problem with the ssRBM is the training part; we are getting an accuracy of around 12% on the MNIST dataset that we used for the binary RBM. We are working on fixing the test.
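For the curious, the memory fix amounts to never materializing diagonal matrices. A sketch of the idea (the numbers and names here are mine, just to illustrate the scale):

```cpp
#include <armadillo>
#include <iostream>

int main()
{
  const size_t n = 50000;

  // Wasteful: a full n x n matrix for a purely diagonal parameter
  // needs n * n * 8 bytes -- about 20 GB for n = 50000.
  // arma::mat lambdaFull(n, n, arma::fill::zeros);  // don't do this

  // Storing only the diagonal needs n * 8 bytes (~400 KB) instead.
  arma::vec lambda = arma::ones<arma::vec>(n);
  arma::vec v = arma::randn<arma::vec>(n);

  // A product like Lambda * v becomes an element-wise multiply...
  arma::vec result = lambda % v;

  // ...and the quadratic form v^T Lambda v becomes a dot product.
  const double quad = arma::dot(v, lambda % v);
  std::cout << "quadratic form: " << quad << std::endl;
}
```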

This week I also managed to finish the GAN implementation. The code works on the test data, but it is giving near-random images even for, say, 1000 iterations of alternating SGD, with the discriminator being trained for 3000 (3 * 1000) iterations and the generator being trained for 1000 iterations (the generator and discriminator being used here are just simple FFNs). The GAN PR also requires review for me to fully understand where I am going wrong. I also want to thank Konstantin here, since I was using the cross-entropy code that he wrote for the GANs. I am also not sure how to test a GAN; right now I am just trying to see if it can generate realistic images.
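To make the training schedule concrete, here is the alternating scheme I am using, in skeleton form (following Algorithm 1 of Goodfellow's paper; the two function objects stand in for single SGD steps and are placeholders, not the actual PR API):

```cpp
#include <cstddef>
#include <functional>
#include <iostream>

// Alternating SGD for a GAN (Goodfellow et al., Algorithm 1): k
// discriminator updates per generator update. The std::function
// arguments are placeholders for one SGD step on each network.
void TrainGAN(const std::function<void()>& discriminatorStep,
              const std::function<void()>& generatorStep,
              const size_t iterations,
              const size_t k)
{
  for (size_t i = 0; i < iterations; ++i)
  {
    // Train the discriminator k times on real + generated batches...
    for (size_t j = 0; j < k; ++j)
      discriminatorStep();

    // ...then update the generator once through the discriminator.
    generatorStep();
  }
}

int main()
{
  size_t dSteps = 0, gSteps = 0;
  // With k = 3 and 1000 iterations this gives the 3000 discriminator
  // steps and 1000 generator steps described above.
  TrainGAN([&]() { ++dSteps; }, [&]() { ++gSteps; }, 1000, 3);
  std::cout << dSteps << " discriminator steps, "
            << gSteps << " generator steps" << std::endl;
}
```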

Next week: I will mostly be working on fixing both the GAN and ssRBM tests. I will also write serialization for the GANs. I am hoping that both PRs will be mergeable within 10 days.
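For the serialization part, my plan is to follow the usual mlpack convention: a Serialize() method that registers each piece of state with data::CreateNVP, so data::Save() and data::Load() can round-trip the model. A minimal sketch of that pattern (the class and member here are hypothetical, not the actual GAN code):

```cpp
#include <mlpack/core.hpp>

using namespace mlpack;

// Hypothetical stand-in for the GAN class, showing only the
// serialization hook that mlpack's data::Save()/Load() expect.
class GANSketch
{
 public:
  // The flattened weights of both networks; the member name is mine.
  arma::mat parameter;

  template<typename Archive>
  void Serialize(Archive& ar, const unsigned int /* version */)
  {
    ar & data::CreateNVP(parameter, "parameter");
  }
};

int main()
{
  GANSketch gan;
  gan.parameter = arma::randu<arma::mat>(10, 10);

  // Round-trip the model through boost::serialization.
  data::Save("gan.xml", "gan", gan);

  GANSketch loaded;
  data::Load("gan.xml", "gan", loaded);
}
```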


Deep Learning Module in MLpack (Week 5)

Week Five

This week was primarily focused on reading and understanding the ssRBM paper. I also opened a new PR for ssRBM that basically implements the spike-slab layer. Our approach for implementing the ssRBM is that it will be a policy class of the RBM class. This means we would have much less code duplication: the only things we would actually need to implement for the ssRBM are the Gradients, Reset and FreeEnergy functions. I have already implemented these; the only part remaining is the refactoring of the RBM class.
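To show what I mean by a policy class, here is a stripped-down sketch of the design (the interfaces are illustrative, not the actual mlpack code): the RBM template holds all the generic training logic and delegates the model-specific quantities to the policy.

```cpp
#include <armadillo>
#include <iostream>
#include <utility>

// The RBM template is written once; anything model-specific (free
// energy, gradients, reset) is delegated to the policy type.
template<typename RBMPolicy>
class RBM
{
 public:
  explicit RBM(RBMPolicy policy) : policy(std::move(policy)) { }

  // Generic training code would live here, calling into the policy
  // whenever a model-specific quantity is needed.
  double FreeEnergy(const arma::vec& input) { return policy.FreeEnergy(input); }

 private:
  RBMPolicy policy;
};

// A toy policy exposing the three hooks mentioned above.
class BinaryRBMPolicy
{
 public:
  double FreeEnergy(const arma::vec& input) { return arma::accu(input); }
  void Gradient(const arma::vec& /* input */, arma::mat& /* gradient */) { }
  void Reset() { }
};

int main()
{
  RBM<BinaryRBMPolicy> rbm{BinaryRBMPolicy()};
  std::cout << rbm.FreeEnergy(arma::ones<arma::vec>(3)) << std::endl;
}
```

Swapping in a spike-slab policy with the same three hooks would then give us the ssRBM without touching the shared training code.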

The plan for the next week is basically to complete the ssRBM implementation and hopefully test it for classification on the MNIST/CIFAR datasets. I don't exactly know how we would add these to the repo, as the CIFAR dataset is huge.


Deep Learning Module in MLpack (Week 4)

Week Four

This week we mainly put finishing touches on our existing binary RBM PR. The finishing touches took time, mainly because we were not able to train the RBM correctly, and because of one disastrous commit I submitted that actually rolled back the changes I had made earlier :(. We did a lot of trial and error (mainly with the Gibbs sampling step) to finally make it work. I now understand why people in Deep Learning talk so much about how hard it is to train DL models.

Here are our results on the MNIST dataset.

The samples are generated from 1000 steps of Gibbs sampling.

This image is generated from the deeplearning.net example.
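Since the Gibbs sampling step is what gave us the most trouble, here is what one alternating step looks like in a binary RBM (a minimal Armadillo sketch of the standard update, not the PR's code):

```cpp
#include <armadillo>

// One Gibbs step for a binary RBM: sample the hidden units from
// p(h | v), then resample the visible units from p(v | h).
// W is (hidden x visible); b and c are the visible/hidden biases.
arma::vec GibbsStep(const arma::vec& v, const arma::mat& W,
                    const arma::vec& b, const arma::vec& c)
{
  // p(h = 1 | v) = sigmoid(W v + c), then draw Bernoulli samples.
  arma::vec ph = 1.0 / (1.0 + arma::exp(-(W * v + c)));
  arma::vec h = arma::conv_to<arma::vec>::from(
      arma::randu<arma::vec>(ph.n_elem) < ph);

  // p(v = 1 | h) = sigmoid(W^T h + b), sampled the same way.
  arma::vec pv = 1.0 / (1.0 + arma::exp(-(W.t() * h + b)));
  return arma::conv_to<arma::vec>::from(
      arma::randu<arma::vec>(pv.n_elem) < pv);
}

int main()
{
  const size_t visible = 784, hidden = 128;
  arma::mat W = 0.01 * arma::randn<arma::mat>(hidden, visible);
  arma::vec b(visible, arma::fill::zeros), c(hidden, arma::fill::zeros);

  // Run a 1000-step chain from a random starting image, as in the
  // samples shown above.
  arma::vec v = arma::round(arma::randu<arma::vec>(visible));
  for (size_t i = 0; i < 1000; ++i)
    v = GibbsStep(v, W, b, c);
}
```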

We also added another test, which is basically a classification test using the learnt latent representation of the MNIST dataset. We compared our results with the scikit-learn framework, and we were able to get better accuracy than them on the subset of test cases, though I think it would be a fairer comparison if we did 10-fold cross-validation. Right now, with a test size of 100 and a train size of 2500, our implementation's classification accuracy is around 90%, while scikit-learn's is only around 78% (the scikit-learn implementation uses one Gibbs step). We have also added a serialization test to our implementation this week.
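The idea behind the test is simple: push every image through the trained RBM's hidden layer and hand the resulting activations to an ordinary classifier in place of the raw pixels. A sketch of the feature-transform half (the trained W and c here are stand-ins):

```cpp
#include <armadillo>

// Compute p(h = 1 | v) for every column of X at once; these hidden
// activations are the latent features used for classification.
arma::mat LatentFeatures(const arma::mat& X,  // one image per column
                         const arma::mat& W, const arma::vec& c)
{
  arma::mat a = W * X;
  a.each_col() += c;
  return 1.0 / (1.0 + arma::exp(-a));
}

int main()
{
  // Stand-ins for the trained parameters (hidden=128, visible=784).
  arma::mat W = 0.01 * arma::randn<arma::mat>(128, 784);
  arma::vec c(128, arma::fill::zeros);
  arma::mat train = arma::randu<arma::mat>(784, 2500);
  arma::mat test = arma::randu<arma::mat>(784, 100);

  // These feature matrices would then be fed to a classifier (e.g.
  // softmax regression) instead of the raw pixel values.
  arma::mat trainFeat = LatentFeatures(train, W, c);
  arma::mat testFeat = LatentFeatures(test, W, c);
}
```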

I think the PR will be accepted this week (fingers crossed).

I have also started working on the ssRBM. It was disappointing to see that no other library has actually implemented the ssRBM, so there is nothing we can compare our results against; even the authors do not provide a link to their code. Anyway, I have implemented the spike-slab (hidden) layer and the visible layer for the ssRBM, and will be opening a PR by this weekend.

The main goals for next week are the following: 1. Implement ssRBM. 2. Start writing tests for ssRBM.

PS. I would like to thank Mikhail for all the help this week :)
