[mlpack] [mlpack/mlpack] adds GammaDistribution::Train(observations, probabilities) (#834)

Yashu Seth notifications at github.com
Fri Dec 23 02:05:10 EST 2016


yashu-seth commented on this pull request.



+  for (size_t i = 0; i < N; i++)
+    probabilities(i) = prob(generator);
+
+  // Fit results with both probabilities and data.
+  GammaDistribution gDist;
+  gDist.Train(rdata, probabilities);
+
+  // Fit results with only the data.
+  GammaDistribution gDist2;
+  gDist2.Train(rdata);
+
+  BOOST_REQUIRE_CLOSE(gDist2.Alpha(0), gDist.Alpha(0), 10);
+  BOOST_REQUIRE_CLOSE(gDist2.Beta(0), gDist.Beta(0), 10);
+
+  BOOST_REQUIRE_CLOSE(alphaReal, gDist.Alpha(0), 10);
+  BOOST_REQUIRE_CLOSE(betaReal, gDist.Beta(0), 10);

@rcurtin I checked the implementation and could not find anything incorrect. 

But I don't think gDist and gDist2 should necessarily agree to within 1e-5 when the probabilities are sampled from a uniform distribution between 0 and 1. For example, some data points will be assigned a very low probability and treated almost as if they were not present at all, so gDist and gDist2 are not trained on exactly the same effective data.

In support of this argument: when I train the distribution with all probabilities equal to 1, I get exactly the same distribution as when I train it with just the data.

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/834
