[mlpack] Implementation of matlab ksdensity function
Ryan Curtin
ryan at ratml.org
Tue Feb 13 15:54:26 EST 2018
On Tue, Feb 13, 2018 at 02:44:03PM +0000, Angelo DI SENA wrote:
> Hi Ryan
>
> Thanks for your answer.
> I only partially understood your suggestion.
> This is due to my poor knowledge of the math behind it.
>
> In the MATLAB script I'm trying to convert,
> vector has 3000 elements (1x3000)
> with values between -1 and 1, and
> pts is a vector of 200 elements (200x1).
>
> From the MATLAB documentation, the result should be 200 pairs of values (one for each element in pts).
> So, what is not clear is how I should treat vector.
> For each value in pts, which values from vector must be considered?
Hi Angelo,
No problem, I am happy to try to help out. I can explain basic kernel
density estimation; however, you should double-check the MATLAB
implementation and make sure you change my description below to fit what
they are actually doing. For instance, I think that ksdensity() does
auto-tune the bandwidth of the kernel, but my discussion below will
assume a hand-chosen bandwidth.
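If you would rather not pick the bandwidth by hand, one common rule of
thumb is the normal reference rule, bw = sigma * (4 / (3 * n))^(1/5),
where sigma is the sample standard deviation of the reference points.
The sketch below is only an illustration of that rule: the function
name NormalReferenceBandwidth is made up for this example, and I am not
claiming this is exactly what ksdensity() computes, so compare against
the MATLAB documentation before relying on it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normal reference rule of thumb for a Gaussian kernel bandwidth.
// NormalReferenceBandwidth is a hypothetical helper, not an mlpack or
// MATLAB function.
double NormalReferenceBandwidth(const std::vector<double>& reference)
{
  const std::size_t n = reference.size();
  assert(n >= 2); // The sample standard deviation needs two points.

  // Sample mean of the reference points.
  double mean = 0.0;
  for (const double p : reference)
    mean += p;
  mean /= n;

  // Sample standard deviation (divides by n - 1).
  double sumSq = 0.0;
  for (const double p : reference)
    sumSq += (p - mean) * (p - mean);
  const double sigma = std::sqrt(sumSq / (n - 1));

  // bw = sigma * (4 / (3 * n))^(1/5).
  return sigma * std::pow(4.0 / (3.0 * n), 0.2);
}
```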
When you do kernel density estimation, you are assuming that your
density f(x) can be modeled by an average of kernel functions, one
centered at each reference point:
f(x) = (1 / n) * sum_{i = 1}^{n} K(x, p_i, bw)
where { p_1, ..., p_n } are the reference points (called 'vector' in
your code, containing n = 3000 one-dimensional points), 'x' is the query
point (one element of 'pts' in your code), and 'bw' is the bandwidth of
the kernel.
The kernel function, if you choose a Gaussian, is just
K(x, p_i, bw) = exp(-| x - p_i |^2 / (2 * bw^2)) / (bw * sqrt(2 * pi)),
which is the density at x of a Gaussian with mean p_i and standard
deviation bw, so you can use GaussianDistribution for that part.
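To make the kernel concrete, here is a minimal sketch in plain C++ that
uses only the standard library rather than mlpack's
GaussianDistribution. The name GaussianKernel is made up for this
example, and the sketch includes the 1 / (bw * sqrt(2 * pi))
normalization constant so that the kernel integrates to 1.

```cpp
#include <cassert>
#include <cmath>

// One-dimensional Gaussian kernel: the density at x of a normal
// distribution with mean p and standard deviation bw.
// GaussianKernel is a hypothetical helper, not part of mlpack.
double GaussianKernel(const double x, const double p, const double bw)
{
  assert(bw > 0.0); // The bandwidth must be positive.

  const double pi = std::acos(-1.0);
  const double z = (x - p) / bw;
  return std::exp(-0.5 * z * z) / (bw * std::sqrt(2.0 * pi));
}
```

The kernel is largest when x equals p and decays as x moves away from
p; a larger bw makes that decay slower, which gives a smoother estimate.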
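Since you want results for each point in 'pts', you can just repeat that
f(x) calculation for each point in 'pts'. Every one of the 3000 points
in 'vector' is used for every query point; points far from the query
simply contribute almost nothing to the sum.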
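Putting the pieces together, the whole computation can be sketched as
two nested loops. This is only a sketch under my assumptions: a
Gaussian kernel, a hand-chosen bandwidth, the estimate averaged over
the reference points, and a made-up function name EvaluateKde. It uses
std::vector instead of mlpack or Armadillo types so that it stands on
its own.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Evaluate a Gaussian kernel density estimate at each query point.
// 'reference' plays the role of 'vector' and 'queries' the role of
// 'pts'. EvaluateKde is a hypothetical helper, not part of mlpack.
std::vector<double> EvaluateKde(const std::vector<double>& reference,
                                const std::vector<double>& queries,
                                const double bw)
{
  assert(!reference.empty() && bw > 0.0);

  // Shared factor: 1 / (n * bw * sqrt(2 * pi)).
  const double pi = std::acos(-1.0);
  const double norm = 1.0 / (reference.size() * bw * std::sqrt(2.0 * pi));

  std::vector<double> densities(queries.size(), 0.0);
  for (std::size_t j = 0; j < queries.size(); ++j)
  {
    // Every reference point contributes to every query point.
    double sum = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i)
    {
      const double z = (queries[j] - reference[i]) / bw;
      sum += std::exp(-0.5 * z * z);
    }
    densities[j] = norm * sum;
  }
  return densities;
}
```

With 3000 reference points and 200 query points this is 600,000 kernel
evaluations and returns 200 density values, one per element of 'pts'.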
I hope this is helpful... let me know if I can clarify anything.
Thanks!
Ryan
--
Ryan Curtin | "For more enjoyment and greater efficiency,
ryan at ratml.org | consumption is being standardized."