mlpack IRC logs, 2018-06-22

Logs for the day 2018-06-22 (starts at 0:00 UTC) are shown below.

--- Log opened Fri Jun 22 00:00:14 2018
04:28 < jenkins-mlpack> Project docker mlpack weekly build build #47: STILL UNSTABLE in 3 hr 41 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20weekly%20build/47/
04:28 < jenkins-mlpack> * wenhao.huang.work: add cf no_normalization
04:28 < jenkins-mlpack> * wenhao.huang.work: normalizationType constructor
04:28 < jenkins-mlpack> * wenhao.huang.work: small bugfix
04:28 < jenkins-mlpack> * wenhao.huang.work: add normalization cmakelist
04:28 < jenkins-mlpack> * wenhao.huang.work: modify cf files
04:28 < jenkins-mlpack> * wenhao.huang.work: change CF to CF<>
04:28 < jenkins-mlpack> * wenhao.huang.work: add normalization to cmakelist
04:28 < jenkins-mlpack> * wenhao.huang.work: update comments
04:28 < jenkins-mlpack> * wenhao.huang.work: style fix
04:28 < jenkins-mlpack> * wenhao.huang.work: bug fix
04:28 < jenkins-mlpack> * wenhao.huang.work: add overall mean normalization
04:28 < jenkins-mlpack> * wenhao.huang.work: add user/item normalization
04:28 < jenkins-mlpack> * wenhao.huang.work: bugfix
04:28 < jenkins-mlpack> * wenhao.huang.work: add z-score normalization
04:28 < jenkins-mlpack> * wenhao.huang.work: add combined normalization
04:28 < jenkins-mlpack> * wenhao.huang.work: update comments
04:28 < jenkins-mlpack> * wenhao.huang.work: very small style fix
04:28 < jenkins-mlpack> * wenhao.huang.work: update comments & debug
04:28 < jenkins-mlpack> * wenhao.huang.work: remove if(cleanData) block
04:28 < jenkins-mlpack> * haritha1313: advanced ctor
04:28 < jenkins-mlpack> * wenhao.huang.work: use complete sentences for examples
04:28 < jenkins-mlpack> * wenhao.huang.work: change method names to Mean() and return const reference
04:28 < jenkins-mlpack> * wenhao.huang.work: change param specification
04:28 < jenkins-mlpack> * wenhao.huang.work: new line for brace
04:28 < jenkins-mlpack> * wenhao.huang.work: templatize Normalize() functions in some classes
04:28 < jenkins-mlpack> * haritha1313: tests edit
04:28 < jenkins-mlpack> * haritha1313: style edits
04:28 < jenkins-mlpack> * wenhao.huang.work: initialize members in constructor
04:28 < jenkins-mlpack> * wenhao.huang.work: style
04:28 < jenkins-mlpack> * wenhao.huang.work: style fix
04:28 < jenkins-mlpack> * wenhao.huang.work: Denormalize(users(i), ...)
04:28 < jenkins-mlpack> * wenhao.huang.work: change from arma::vec userMean to arma::rowvec userMean
04:28 < jenkins-mlpack> * Marcus Edel: Minor style fixes and add serialization.
05:25 -!- travis-ci [~travis-ci@ec2-54-227-123-80.compute-1.amazonaws.com] has joined #mlpack
05:25 < travis-ci> manish7294/mlpack#31 (lmnn - 1b571f2 : Manish): The build has errored.
05:25 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/620ee5987fa2...1b571f2b5cf2
05:25 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/77022112
05:25 -!- travis-ci [~travis-ci@ec2-54-227-123-80.compute-1.amazonaws.com] has left #mlpack []
07:08 -!- travis-ci [~travis-ci@ec2-54-145-114-141.compute-1.amazonaws.com] has joined #mlpack
07:08 < travis-ci> manish7294/mlpack#32 (lmnn - 1f08582 : Manish): The build has errored.
07:08 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/1b571f2b5cf2...1f0858230c37
07:08 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/77026701
07:08 -!- travis-ci [~travis-ci@ec2-54-145-114-141.compute-1.amazonaws.com] has left #mlpack []
07:17 < Atharva> zoq: rcurtin: The NormalDistribution class has become very specific to the ann module, so Sumedh and I were thinking about moving it into the ann module under a new folder, dists. Is that okay? Or should we keep it in the core/dists folder?
08:55 < zoq> Atharva: In this case I would put the class into the ann folder.
08:56 < Atharva> zoq: Okay, so should I create a new dists folder under the ann folder?
08:57 < zoq> Atharva: I think that is a good idea.
08:57 < ShikharJ> zoq: Do you have any further comments for the DCGAN API? That PR is required for WGAN.
08:57 < Atharva> zoq: Okay, I will go ahead with it.
08:58 < zoq> ShikharJ: No, let me hit the merge button.
08:58 < ShikharJ> zoq: Sorry for bothering you again and again regarding this.
08:59 < zoq> ShikharJ: No worries, just wanted to get the test time down :)
09:20 < ShikharJ> zoq: For the next two weeks, I'll be focussing on the WGAN PR, weight clipping methods for WGAN, and completing the Dual Optimizer PR. Please let me know if that's fine.
09:56 -!- travis-ci [~travis-ci@ec2-54-211-251-47.compute-1.amazonaws.com] has joined #mlpack
09:56 < travis-ci> mlpack/mlpack#5130 (master - 6fd5e52 : Marcus Edel): The build was broken.
09:56 < travis-ci> Change view : https://github.com/mlpack/mlpack/compare/0cf310abf1c2...6fd5e527b54b
09:56 < travis-ci> Build details : https://travis-ci.org/mlpack/mlpack/builds/395379647
09:56 -!- travis-ci [~travis-ci@ec2-54-211-251-47.compute-1.amazonaws.com] has left #mlpack []
10:07 < jenkins-mlpack> Project docker mlpack nightly build build #357: STILL UNSTABLE in 2 hr 53 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/357/
10:15 -!- travis-ci [~travis-ci@ec2-54-211-232-50.compute-1.amazonaws.com] has joined #mlpack
10:15 < travis-ci> manish7294/mlpack#33 (lmnn - 104fb2a : Manish): The build has errored.
10:15 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/1f0858230c37...104fb2aa1ede
10:15 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/77042568
10:15 -!- travis-ci [~travis-ci@ec2-54-211-232-50.compute-1.amazonaws.com] has left #mlpack []
11:02 -!- wenhao [731bc2ed@gateway/web/freenode/ip.115.27.194.237] has joined #mlpack
11:20 < wenhao> zoq, rcurtin : I'm sorry for the late reply. I am a bit stuck with programming with opencl for my term project this week :(
11:23 < wenhao> zoq: For different search policies, as they use different metrics for choosing neighbors and calculating the distance, I think the resulting neighbors and similarities are different
11:26 < wenhao> zoq: And by "accumulate the results over multiple runs", did you mean running the algorithm with different seeds ?
11:50 < wenhao> rcurtin: Yes that will be useful. I didn't know that LSH is an alternative search method in mlpack. Thanks for the advice:)
12:00 < wenhao> lozhnikov: Hi Mikhail. I am thinking about how to implement BiasSVD and SVD++. One issue is that they are not based on matrix factorization in the form of V = W * H, so I might have to refactor the CFType<...> class template to allow for the implementation of BiasSVD and SVD++ models.
12:04 < wenhao> One of my ideas is to rename the current `CFType<>` to something like `CFMatrixDecompositionModel`, which would be used specifically for CF algorithms based on matrix factorization. Then we can add a wrapper class CFType<ModelType> with interfaces including
12:04 < wenhao> `Predict`, `GetRecommendations`, etc.
12:05 < wenhao> In this way we can easily add models based on methods other than matrix factorization
12:07 < wenhao> I'm sure whether it's the best way to implement the new models. Any idea or suggestion would be helpful!
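The wrapper idea in the messages above could look roughly like the following sketch. The names `BiasOnlyModel` and `CFWrapper` are hypothetical and not part of mlpack, and ratings are predicted from simple biases only to show a model that is not of the form V = W * H; the actual CFType interface may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model that is NOT a factorization V = W * H: it predicts a
// rating from a global mean plus per-user and per-item biases.
struct BiasOnlyModel
{
  double globalMean;
  std::vector<double> userBias;
  std::vector<double> itemBias;

  double Predict(const std::size_t user, const std::size_t item) const
  {
    assert(user < userBias.size() && item < itemBias.size());
    return globalMean + userBias[user] + itemBias[item];
  }

  std::size_t NumItems() const { return itemBias.size(); }
};

// Hypothetical wrapper: it only relies on ModelType providing Predict() and
// NumItems(), so factorization-based and other models can share it.
template<typename ModelType>
class CFWrapper
{
 public:
  explicit CFWrapper(ModelType modelIn) : model(std::move(modelIn)) { }

  double Predict(const std::size_t user, const std::size_t item) const
  {
    return model.Predict(user, item);
  }

  // Return the indices of the `count` items with the highest predicted
  // rating for `user`, best first.
  std::vector<std::size_t> GetRecommendations(const std::size_t user,
                                              const std::size_t count) const
  {
    std::vector<std::size_t> items(model.NumItems());
    for (std::size_t i = 0; i < items.size(); ++i)
      items[i] = i;

    const std::size_t k = std::min(count, items.size());
    std::partial_sort(items.begin(), items.begin() + k, items.end(),
        [&](const std::size_t a, const std::size_t b)
        {
          return model.Predict(user, a) > model.Predict(user, b);
        });
    items.resize(k);
    return items;
  }

 private:
  ModelType model;
};
```

A matrix-factorization model would plug into the same wrapper by exposing the same two member functions, which is the point of the proposed split.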
12:30 -!- wenhao [731bc2ed@gateway/web/freenode/ip.115.27.194.237] has quit [Ping timeout: 260 seconds]
12:53 < ShikharJ> rcurtin: Are you there?
13:13 < zoq> ShikharJ: Sounds like a good plan to me.
13:14 < ShikharJ> If we're able to pull that off, we'll be done with our main goals by Phase II. Then we can focus on the other pending PR by Kris on RBMs.
13:18 < zoq> Right, plenty of time for some cool experiments too.
13:47 -!- wenhao [731bc2ed@gateway/web/freenode/ip.115.27.194.237] has joined #mlpack
14:04 -!- travis-ci [~travis-ci@ec2-54-167-20-109.compute-1.amazonaws.com] has joined #mlpack
14:04 < travis-ci> manish7294/mlpack#34 (lmnn - 58b17e2 : Manish): The build has errored.
14:04 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/104fb2aa1ede...58b17e27284a
14:04 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/77063148
14:04 -!- travis-ci [~travis-ci@ec2-54-167-20-109.compute-1.amazonaws.com] has left #mlpack []
14:20 < wenhao> just found a typo in my previous message: I'm *not* sure whether it's the best way to implement the new models.
14:50 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has joined #mlpack
14:55 < rcurtin> ShikharJ: I'm here now, sorry---I slept a little late today
15:10 -!- travis-ci [~travis-ci@ec2-54-166-198-235.compute-1.amazonaws.com] has joined #mlpack
15:10 < travis-ci> manish7294/mlpack#35 (lmnn - 4321cb7 : Manish Kumar): The build passed.
15:10 < travis-ci> Change view : https://github.com/manish7294/mlpack/compare/58b17e27284a...4321cb7f10c4
15:10 < travis-ci> Build details : https://travis-ci.com/manish7294/mlpack/builds/77063351
15:10 -!- travis-ci [~travis-ci@ec2-54-166-198-235.compute-1.amazonaws.com] has left #mlpack []
15:33 -!- manish7294 [~yaaic@2405:205:2480:faee:b86a:d44c:eb9c:22bf] has joined #mlpack
15:36 < manish7294> rcurtin: I tried changing constraints to a class member, but somehow things broke. If you could have a quick look at the commit "update documentation", I think that would help a lot.
15:38 < ShikharJ> rcurtin: No worries, I overcame the issue I was facing.
15:39 -!- wenhao [731bc2ed@gateway/web/freenode/ip.115.27.194.237] has quit [Ping timeout: 260 seconds]
15:41 < rcurtin> ShikharJ: ok, sounds good
15:41 < rcurtin> manish7294: ok, let me take a look...
15:42 < ShikharJ> rcurtin: Though I was wondering how you came up with the name ratml for your personal domain (more specifically, Rage Against The Machine Learning)?
15:43 < rcurtin> ShikharJ: it was a joke based on 'rage against the machine' by a friend... so I can't claim credit myself
15:43 < rcurtin> but since he did not work in machine learning I stole it :)
15:43 < rcurtin> manish7294: I see in the commit that you kept `dataset` as a member of the Constraints class, I was thinking that maybe you could remove that and take `dataset` as a parameter to Impostors() or TargetNeighbors()
15:43 < ShikharJ> rcurtin: The rock band 'Rage Against The Machine'?
15:43 < rcurtin> yeah
15:44 < ShikharJ> haha
15:46 < manish7294> rcurtin: Okay will try that, thanks!
15:47 < rcurtin> right, I see that the commit you sent failed the test. I'm not sure why, but if you want to try the approach I suggested, we can try and find a bug once it's implemented (if there is one)
15:49 < manish7294> Ya, on doing that the optimization process became unbalanced and the results were extremely poor
15:54 < rcurtin> right, now ideally it should not have changed the results at all
15:56 -!- manish72942 [~yaaic@2405:205:2526:d7d8:b86a:d44c:eb9c:22bf] has joined #mlpack
15:56 < manish72942> Ya, that seems starnge
15:57 < manish72942> *strange
15:58 -!- manish7294 [~yaaic@2405:205:2480:faee:b86a:d44c:eb9c:22bf] has quit [Ping timeout: 245 seconds]
15:58 < manish72942> maybe I missed something while making that change
16:08 < rcurtin> yeah, it's possible that 'dataset' was being used somewhere where 'transformedDataset' should have been used
16:08 < rcurtin> that would be my first guess
16:08 < rcurtin> but let's see what happens with the refactoring and then dig in if we need
16:22 < zoq> wenhao: Ideally, we use the same conditions for each, but not sure that's easy enough at this point, so using different seeds might be a first test to see drifts in the results.
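The "different seeds" test mentioned above can be sketched as follows. `RunOnce()` is a hypothetical stand-in for one run of a randomized algorithm and just returns a noisy score; nothing here is an mlpack API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for one run of a randomized algorithm (for example
// a CF model with random initialization); returns a score such as RMSE.
double RunOnce(const unsigned int seed)
{
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> noise(-0.05, 0.05);
  return 1.0 + noise(rng);
}

struct Summary
{
  double mean;
  double stddev;
};

// Run once per seed and summarize how much the score drifts between runs.
Summary Accumulate(const std::vector<unsigned int>& seeds)
{
  assert(!seeds.empty());
  double sum = 0.0, sumSq = 0.0;
  for (const unsigned int seed : seeds)
  {
    const double score = RunOnce(seed);
    sum += score;
    sumSq += score * score;
  }
  const double mean = sum / seeds.size();
  // Population variance; clamped at zero to guard against rounding.
  const double var = std::max(0.0, sumSq / seeds.size() - mean * mean);
  return { mean, std::sqrt(var) };
}
```

A small standard deviation across seeds suggests that differences between search policies are not just noise from random initialization.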
16:25 < zoq> wenhao: I'm not sure I see the reason for another class, can you clarify that point?
16:53 < manish72942> rcurtin: Looks like I found the issue. If you look at shuffle() in lmnn_function_impl.hpp, the labels are shuffled too, which leaves the precalculated part of Constraints invalid, hence the poor results. So we may have to call Precalculate() again on each shuffle() call.
16:55 < manish72942> I will make the required changes by tomorrow and then we can have it merged.
17:13 < rcurtin> manish72942: right, I guess maybe we have to pass the shuffled labels also then
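The stale-cache problem described above can be illustrated with a small sketch. The `Constraints` struct here is a hypothetical simplification (a per-label index cache over 1-D points), not the mlpack LMNN class; it only shows why anything precomputed from the labels has to be rebuilt after the points and labels are permuted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical cache: for each label, the indices of the points carrying it.
struct Constraints
{
  std::map<int, std::vector<std::size_t>> indexByLabel;

  void Precalculate(const std::vector<int>& labels)
  {
    indexByLabel.clear();
    for (std::size_t i = 0; i < labels.size(); ++i)
      indexByLabel[labels[i]].push_back(i);
  }
};

// Shuffle points and labels with the same permutation, then rebuild the
// cache so that it matches the new ordering.
void Shuffle(std::vector<double>& points,
             std::vector<int>& labels,
             Constraints& constraints,
             std::mt19937& rng)
{
  assert(points.size() == labels.size());
  std::vector<std::size_t> order(points.size());
  std::iota(order.begin(), order.end(), 0);
  std::shuffle(order.begin(), order.end(), rng);

  std::vector<double> newPoints(points.size());
  std::vector<int> newLabels(labels.size());
  for (std::size_t i = 0; i < order.size(); ++i)
  {
    newPoints[i] = points[order[i]];
    newLabels[i] = labels[order[i]];
  }
  points = std::move(newPoints);
  labels = std::move(newLabels);

  constraints.Precalculate(labels);  // Without this, the cache is stale.
}
```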
18:09 < manish72942> Maybe now we can use Dataset() and Label() and just make a call to Precalculate() during shuffle(); this way we can avoid changing much of the structure. Does that sound reasonable?
18:21 < rcurtin> manish72942: the code you sent earlier ended up making a copy of the dataset each time Dataset() was called
18:21 < rcurtin> so even if it is a little more work I think it's better to refactor so that the dataset and la els are being passed as reference-type parameters
18:22 < manish72942> sure, thanks for explaining :)
18:23 < rcurtin> labels* not la els :)
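A minimal sketch of the reference-passing refactoring suggested above, under simplifying assumptions: the data is 1-D, `TargetNeighbor()` is a hypothetical single-neighbor version of the real function, and `std::vector` stands in for the Armadillo types mlpack uses. The class holds no dataset member and borrows the data through const references, so no copy is made per call.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the Constraints class: it stores no dataset.
class Constraints
{
 public:
  // Index of the nearest point sharing the label of `query` (k = 1).
  // `dataset` and `labels` are passed by const reference, so they are
  // never copied.
  std::size_t TargetNeighbor(const std::vector<double>& dataset,
                             const std::vector<int>& labels,
                             const std::size_t query) const
  {
    assert(dataset.size() == labels.size() && query < dataset.size());
    std::size_t best = query;
    double bestDist = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < dataset.size(); ++i)
    {
      if (i == query || labels[i] != labels[query])
        continue;
      const double dist = std::abs(dataset[i] - dataset[query]);
      if (dist < bestDist)
      {
        bestDist = dist;
        best = i;
      }
    }
    return best;  // Equals `query` if no other point shares its label.
  }
};
```

Because the caller supplies the current (possibly shuffled) dataset and labels on every call, the class cannot silently work on an outdated copy.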
19:53 < ShikharJ> zoq: Sorry for reaching out a little late. Are you there?
19:56 < zoq> ShikharJ: I'm here.
19:57 < ShikharJ> zoq: I was looking into making the DualOptimizer API as close as possible to the current API of optimizers, but it seems I'll have to pass some additional parameters to make it work.
19:58 < ShikharJ> For example, in a single optimizer, in the optimize step, we run the routine over the entire set of parameters.
19:59 < ShikharJ> But in the case of GANs, we'll have to train the generator separately on the parameters up to genWeights, and the discriminator from there to the end.
20:00 < ShikharJ> I mean the index of the submatrix would be from 0 to genWeights - 1 and genWeights to parameters.n_elem - 1, or something like that.
20:02 < ShikharJ> zoq: Do you think we should add the genWeights parameter in the Optimize() function in the dual Optimizer class, or should we instead pass this in the constructor itself?
20:03 < zoq> ShikharJ: hm, how we update the parameters is function-specific (GAN, logistic regression, etc.), so in the GAN function we could select the correct index.
20:04 < zoq> The issue I see is, that we allocate unnecessary/unused memory for the optimization process.
20:07 < zoq> So what you propose is to pass the weights for the two functions right?
20:08 < ShikharJ> zoq: Just the indices of the weight boundaries, but I'm not sure if this should happen inside Optimize() function which has a set number of arguments for most optimizers, or this should be done in the constructor.
20:09 -!- manish72942 [~yaaic@2405:205:2526:d7d8:b86a:d44c:eb9c:22bf] has quit [Ping timeout: 240 seconds]
20:12 < ShikharJ> Basically whether the function signature should be changed here (https://github.com/mlpack/mlpack/pull/1437/files#diff-64799002b4618155c00cc66b45610c74R38) or here (https://github.com/mlpack/mlpack/pull/1437/files#diff-64799002b4618155c00cc66b45610c74R31).
20:17 < zoq> I don't see any issues with either the constructor or the optimizer method; in either case we should set a default parameter that uses the full set for both, so that it can be used with the existing methods.
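One possible reading of the index split and the default parameter discussed above, as a sketch only: `DualOptimizerSketch`, `fullSet`, and the plain gradient step are hypothetical and are not the API of the PR under discussion. Here the boundary is passed in the constructor, and omitting it makes both optimizers work on the full parameter set.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Sentinel meaning "no boundary given" (hypothetical).
constexpr std::size_t fullSet = std::numeric_limits<std::size_t>::max();

// Hypothetical dual optimizer: two step sizes and an optional boundary.
class DualOptimizerSketch
{
 public:
  DualOptimizerSketch(const double firstStep,
                      const double secondStep,
                      const std::size_t genWeights = fullSet) :
      firstStep(firstStep), secondStep(secondStep), genWeights(genWeights)
  { }

  // One gradient step. With a boundary, the first optimizer updates
  // [0, genWeights) and the second updates [genWeights, n). Without one,
  // both optimizers update the full set in turn.
  void Step(std::vector<double>& parameters,
            const std::vector<double>& gradient) const
  {
    assert(parameters.size() == gradient.size());
    const std::size_t n = parameters.size();
    if (genWeights == fullSet)
    {
      Update(parameters, gradient, 0, n, firstStep);
      Update(parameters, gradient, 0, n, secondStep);
    }
    else
    {
      assert(genWeights <= n);
      Update(parameters, gradient, 0, genWeights, firstStep);
      Update(parameters, gradient, genWeights, n, secondStep);
    }
  }

 private:
  // Plain gradient descent on parameters[begin, end).
  static void Update(std::vector<double>& parameters,
                     const std::vector<double>& gradient,
                     const std::size_t begin,
                     const std::size_t end,
                     const double stepSize)
  {
    for (std::size_t i = begin; i < end; ++i)
      parameters[i] -= stepSize * gradient[i];
  }

  double firstStep;
  double secondStep;
  std::size_t genWeights;
};
```

Putting the boundary in the constructor keeps the signature of the step function the same as for a single optimizer; passing it to the optimize call instead would be the other option raised in the discussion.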
20:18 < ShikharJ> zoq: Consequently, I also believe that we'll need to have two separate Gradient and Evaluate functions for the two networks right?
20:18 < ShikharJ> Because they have two separate optimizers?
20:19 < zoq> You are talking about the GAN class right?
20:19 < ShikharJ> Yes
20:25 < zoq> Because the input for the one is generated by another network, I think you are right.
20:25 < zoq> What about passing a single function (GAN class) that handles the specifics inside the class itself?
20:26 < zoq> I guess that is basically what we have right now.
20:26 < zoq> Just that we have this dual optimizer class with additional bounds information.
20:27 < zoq> If we implement a specific Evaluate/Gradient function, the optimizer becomes very GAN-specific.
20:30 < ShikharJ> What I have trouble visualizing is this: since we have only a single Evaluate/Gradient function inside the GAN class, when we shift to two of each, how would the individual optimizers know which ones to refer to?
20:31 < zoq> ohh, you are right I missed that point
20:34 < zoq> I guess in this case there is no easy way around the dual function approach
20:38 < zoq> What do you think if we specialize the DualOptimizer class itself for the GAN class? That way we could overwrite the Optimize function and signal once the other network is trained?
20:39 < ShikharJ> zoq: Can we try exploring a template-enabled dual function implementation? Maybe we define two evaluate functions, but we use enable_if to check for the required function (it might not be as easy as I'm guessing it to be, but we can explore this).
20:41 < ShikharJ> Or maybe we can also try what you mentioned above.
20:42 < zoq> I was talking about something like: https://github.com/mlpack/mlpack/blob/0068a53a4d752d9553c08acc752898cf393a4f12/src/mlpack/methods/regularized_svd/regularized_svd_function.hpp
20:42 < zoq> line 149
20:43 < zoq> We could also see if we are able to use some template functions like enable_if as you suggested.
20:44 < ShikharJ> zoq: Ah I see, that can certainly be explored. I'll let you know what I find. I'll explore these two options for now.
20:47 < zoq> ShikharJ: I'll see if I can think of anything else.
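The enable_if idea floated above might be explored along these lines. This is a sketch under assumptions: the member names `EvaluateGenerator()` and `EvaluateDiscriminator()`, the `Network` enum, and `EvaluateFor()` are all hypothetical and do not exist in mlpack. The trait detects whether a function type offers the two per-network objectives, and overload selection falls back to a plain `Evaluate()` otherwise.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class Network { Generator, Discriminator };

// Primary template: assume the type has no per-network objectives.
template<typename T, typename = void>
struct HasDualEvaluate : std::false_type { };

// Chosen only if both (hypothetical) member calls are well-formed.
template<typename T>
struct HasDualEvaluate<T,
    decltype(void(std::declval<const T&>().EvaluateGenerator()),
             void(std::declval<const T&>().EvaluateDiscriminator()))>
    : std::true_type { };

// Selected for function types exposing both objectives: each optimizer
// asks for the objective of the network it is responsible for.
template<typename FunctionType>
typename std::enable_if<HasDualEvaluate<FunctionType>::value, double>::type
EvaluateFor(const FunctionType& f, const Network which)
{
  return (which == Network::Generator) ? f.EvaluateGenerator()
                                       : f.EvaluateDiscriminator();
}

// Fallback for ordinary functions with a single Evaluate().
template<typename FunctionType>
typename std::enable_if<!HasDualEvaluate<FunctionType>::value, double>::type
EvaluateFor(const FunctionType& f, const Network /* unused */)
{
  return f.Evaluate();
}

// Two toy function types used to exercise both overloads.
struct SingleFunction
{
  double Evaluate() const { return 1.5; }
};

struct GanLikeFunction
{
  double EvaluateGenerator() const { return 2.0; }
  double EvaluateDiscriminator() const { return 0.25; }
};

// Usage: the same call works for both kinds of function.
void CheckDispatch()
{
  assert(EvaluateFor(GanLikeFunction{}, Network::Generator) == 2.0);
  assert(EvaluateFor(GanLikeFunction{}, Network::Discriminator) == 0.25);
  assert(EvaluateFor(SingleFunction{}, Network::Generator) == 1.5);
}
```

With this kind of dispatch, existing single-objective functions would keep working unchanged, which matches the goal of keeping the dual optimizer usable with the existing methods.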
21:30 -!- ImQ009 [~ImQ009@unaffiliated/imq009] has quit [Quit: Leaving]
--- Log closed Sat Jun 23 00:00:15 2018