mlpack IRC logs, 2018-11-07
Logs for the day 2018-11-07 (starts at 0:00 UTC) are shown below.
--- Log opened Wed Nov 07 00:00:04 2018
00:08 -!- cjlcarvalho [~caio@2804:d47:1d0d:a700:95d6:5d67:8b41:79ad] has quit [Ping timeout: 252 seconds]
02:02 -!- cjlcarvalho [~email@example.com] has joined #mlpack
03:35 < davida> zoq: I tried LeakyReLU<> to see if it was better. I think I forgot to switch that back before uploading the file. The result isn't any different. I will try doubling the epochs and post the result in TrainingResults.txt again.
03:36 < davida> What bothers me is that the result is so much worse than the same on Tensorflow. I could understand a little bit of difference but not 50% delta.
05:53 -!- ayesdie [~firstname.lastname@example.org] has joined #mlpack
06:03 -!- ayesdie [~email@example.com] has quit [Quit: ayesdie]
07:10 < zoq> davida: I'll have to take a closer look into the conv layer weight initialization.
07:19 < davida> zoq: I completed the 200 epochs and uploaded the results with the standard ReLU Layer. No improvement at all with longer training. In fact it plateaus after about 50 epochs.
07:26 < davida> BTW I renamed Week1Main.cpp as ConvolutionModelApplication.cpp
07:41 < davida> zoq: As the Python example uses Batch Gradient Descent, I set my parameters on SGD to BatchSize=40 and Iterations=27 (40x27 = 1,080 = nbr of training examples) to give me one complete pass thru' the dataset (assuming my understanding of how this works is correct). Something quite weird happened. Training & Test accuracy did not change for the entire 100 epochs and remained very low at 18.6%.
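The batch arithmetic in the message above can be sketched as follows. This is a minimal illustration using only the standard library, not mlpack code; the function names are hypothetical. The log does not say whether the optimizer's iteration limit counts mini-batches or individual examples, so both readings are shown and either one is an assumption to check against the optimizer's documentation.

```cpp
#include <cstddef>

// Number of mini-batches needed to visit every example once.
// Rounds up so that a final, smaller batch is still counted.
std::size_t BatchesPerEpoch(std::size_t numExamples, std::size_t batchSize)
{
  return (numExamples + batchSize - 1) / batchSize;
}

// Iteration budget for `epochs` full passes, ASSUMING one iteration
// means one mini-batch.
std::size_t IterationsIfCountingBatches(std::size_t numExamples,
                                        std::size_t batchSize,
                                        std::size_t epochs)
{
  return BatchesPerEpoch(numExamples, batchSize) * epochs;
}

// Iteration budget for `epochs` full passes, ASSUMING one iteration
// means one individual training example.
std::size_t IterationsIfCountingSamples(std::size_t numExamples,
                                        std::size_t epochs)
{
  return numExamples * epochs;
}
```

With the figures from the log, `BatchesPerEpoch(1080, 40)` is 27. If the limit instead counts individual examples, one pass over the data would need a budget of 1,080, and a limit of 27 would stop well short of a full pass.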
09:07 < zoq> davida: Same if you use a step size of 0.01?
09:24 < davida> zoq: checking now - I think I did try multiple values of alpha and there was no real improvement.
09:36 < davida> zoq: Stepsize = 0.01 / Batchsize = 64 / MaxIter = 1000 / Result: Epoch: 95 Training Accuracy = 23.4259% Test Accuracy = 20.8333%
10:27 < zoq> davida: Okay, one more test, what about using RandomInitialization instead of XavierInitialization.
10:28 < zoq> davida: Either way I'll take a closer look into the conv layer.
10:29 < davida> zoq: I did try RandomInitialization already and it made no difference.
10:30 < zoq> davida: Okay, strange.
10:30 < davida> zoq: I have tried multiple different combinations of hyperparameters and cannot get the result to improve beyond about 50% but that requires me setting MaxIterations to 100,000.
10:32 < davida> zoq: This is why I was wondering if I had actually coded my model correctly. In the Python exercise, Andrew Ng is using One Hot vectors, but I am assuming that all that is taken care of by the NegativeLogLikelihood<>
10:33 < davida> It might be a bit easier if we could actually see the cost after each epoch, but I am really not sure how to calculate that.
10:34 < davida> I am using Train & Test accuracy as a proxy.
13:23 -!- cjlcarvalho [~firstname.lastname@example.org] has quit [Ping timeout: 240 seconds]
17:29 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Read error: Connection reset by peer]
17:33 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
20:51 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 252 seconds]
20:51 -!- vivekp [~vivek@unaffiliated/vivekp] has joined #mlpack
21:07 < zoq> davida: Do you think you could save and upload the train/test data as arma_binary ( trainSetX.save("A.bin"); )?
--- Log closed Thu Nov 08 00:00:06 2018