[mlpack] CMA-ES and LMCMA, GSOC and Gym

Oleksandr Nikolskyy onikolskyy at gmail.com
Thu Apr 8 08:12:10 EDT 2021


Hi Marcus,

I've found at least one error in the CMA-ES implementation in ensmallen:

On lines 209 and 214, where p_sigma is computed, the Cholesky factor of
the covariance matrix is used, whereas the original algorithm uses the
inverse square root of the covariance matrix (eq. 44, page 29 of the
tutorial, https://arxiv.org/pdf/1604.00772.pdf). What is the best way to
compute A^(-1/2) = B D^(-1) B^T in Armadillo?
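
For reference, here is a sketch of the identity in question. It assumes A
is the symmetric positive definite covariance matrix (C in the tutorial),
B is the orthogonal matrix of its eigenvectors, and D is the diagonal
matrix of the square roots of its eigenvalues. The p_sigma update is
written as I understand it from the tutorial, and the symbols L (Cholesky
factor) and Q are my own notation, not taken from the ensmallen code.

```latex
\begin{align*}
A &= B D^{2} B^{T}, \qquad B^{T} B = I, \quad
     D = \operatorname{diag}(d_1, \dots, d_n), \; d_i > 0 \\
A^{-1/2} &= B D^{-1} B^{T}, \qquad
     \text{since } (B D^{-1} B^{T})^{2} = B D^{-2} B^{T} = A^{-1} \\
p_\sigma &\leftarrow (1 - c_\sigma)\, p_\sigma
     + \sqrt{c_\sigma (2 - c_\sigma)\, \mu_{\mathrm{eff}}}\;
       A^{-1/2}\, \frac{m^{(g+1)} - m^{(g)}}{\sigma^{(g)}} \\
A &= L L^{T} \;\Rightarrow\; L = A^{1/2} Q
     \text{ for some orthogonal } Q, \text{ so }
     L^{-1} \neq A^{-1/2} \text{ in general}
\end{align*}
```

The last line is why I think substituting the Cholesky factor changes
p_sigma: it differs from the symmetric root by a rotation Q.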

I stumbled upon the problem while trying to get CMA-ES from ensmallen to
learn CartPole.

Also, I have implemented LM-CMA (https://arxiv.org/abs/1511.00221) and
tested it on CartPole and Rosenbrock.
It outperforms CMA-ES on Rosenbrock in terms of computation time with 1000
parameters. I'd love to contribute it to ensmallen.

I've tried to train it on the Breakout environment, but my computer seems
to have trouble running the learner and the Gym API together; that is also
something I need to solve.

By the way, I am really sorry to be so late with my proposal. I hope to
upload it tonight, and I hope that is still okay.

Best,

Oleksandr