Implementation of random experience replay.

Public Types

  using ActionType = typename EnvironmentType::Action
      Convenient typedef for action.

  using StateType = typename EnvironmentType::State
      Convenient typedef for state.

Public Member Functions

  RandomReplay()

  RandomReplay(const size_t batchSize, const size_t capacity, const size_t dimension = StateType::dimension)
      Construct an instance of random experience replay class.

  void Sample(arma::mat& sampledStates, arma::icolvec& sampledActions, arma::colvec& sampledRewards, arma::mat& sampledNextStates, arma::icolvec& isTerminal)
      Sample some experiences.

  const size_t& Size()
      Get the number of transitions in the memory.

  void Store(const StateType& state, ActionType action, double reward, const StateType& nextState, bool isEnd)
      Store the given experience.
  void Update(arma::mat, arma::icolvec, arma::mat, arma::mat&)
      Update the priorities of transitions and update the gradients.
Detailed Description

Implementation of random experience replay.
At each time step, the interaction between the agent and the environment is saved to a memory buffer. When needed, previously stored experiences are sampled from the buffer to train the agent. Typically the sample is drawn uniformly at random and the memory acts as a first-in, first-out buffer.
Template Parameters
  EnvironmentType    Desired task.
Definition at line 43 of file random_replay.hpp.
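A minimal usage sketch follows. It assumes the CartPole environment shipped with the same reinforcement learning module, its InitialSample()/Sample()/IsTerminal() interface, an enum-style CartPole::Action, and the mlpack::rl namespace and include paths of this mlpack version; the buffer sizes and the chosen action are illustrative only.

    #include <mlpack/core.hpp>
    #include <mlpack/methods/reinforcement_learning/environment/cart_pole.hpp>
    #include <mlpack/methods/reinforcement_learning/replay/random_replay.hpp>

    using namespace mlpack::rl;

    int main()
    {
      // Replay buffer returning 10 experiences per Sample() call and holding
      // at most 10000 transitions; the encoded state dimension defaults to
      // CartPole::State::dimension.
      RandomReplay<CartPole> replay(10, 10000);

      CartPole env;
      CartPole::State state = env.InitialSample();

      // One interaction step; the action choice is arbitrary here (assumes an
      // enum-style CartPole::Action).
      CartPole::Action action = CartPole::Action::forward;
      CartPole::State nextState;
      const double reward = env.Sample(state, action, nextState);
      replay.Store(state, action, reward, nextState, env.IsTerminal(nextState));

      // Draw one random batch of stored transitions for training.
      arma::mat states, nextStates;
      arma::colvec rewards;
      arma::icolvec actions, isTerminal;
      replay.Sample(states, actions, rewards, nextStates, isTerminal);

      return 0;
    }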
using ActionType = typename EnvironmentType::Action
Convenient typedef for action.
Definition at line 47 of file random_replay.hpp.
using StateType = typename EnvironmentType::State
Convenient typedef for state.
Definition at line 50 of file random_replay.hpp.
RandomReplay() [inline]
Definition at line 52 of file random_replay.hpp.
RandomReplay(const size_t batchSize, const size_t capacity, const size_t dimension = StateType::dimension) [inline]
Construct an instance of random experience replay class.
Parameters
  batchSize    Number of examples returned at each sample.
  capacity     Total memory size in terms of number of examples.
  dimension    The dimension of an encoded state.
Definition at line 66 of file random_replay.hpp.
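As an illustration (reusing the CartPole environment assumed in the sketch above), a buffer for 32-sample batches with room for 10000 transitions could be built as follows; the third argument is spelled out only to make the default visible.

    // Batch size 32, capacity 10000; the dimension argument shown here is
    // exactly the default, CartPole::State::dimension.
    RandomReplay<CartPole> replay(32, 10000, CartPole::State::dimension);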
void Sample(arma::mat& sampledStates, arma::icolvec& sampledActions, arma::colvec& sampledRewards, arma::mat& sampledNextStates, arma::icolvec& isTerminal) [inline]
Sample some experiences.
Parameters
  sampledStates       Sampled encoded states.
  sampledActions      Sampled actions.
  sampledRewards      Sampled rewards.
  sampledNextStates   Sampled encoded next states.
  isTerminal          Indicates whether the corresponding next state is a terminal state.
Definition at line 118 of file random_replay.hpp.
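Continuing the sketch in the detailed description, the output containers are declared once and overwritten by each call; as an assumption about the storage layout, each column of the state matrices is expected to hold one encoded state of a sampled transition.

    // These objects are filled by Sample(); previous contents are discarded.
    arma::mat sampledStates, sampledNextStates;
    arma::colvec sampledRewards;
    arma::icolvec sampledActions, isTerminal;

    replay.Sample(sampledStates, sampledActions, sampledRewards,
                  sampledNextStates, isTerminal);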
const size_t& Size() [inline]
Get the number of transitions in the memory.
Definition at line 140 of file random_replay.hpp.
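A common pattern, sketched below with the hypothetical variable batchSize standing in for the value passed to the constructor and the containers from the Sample() sketch above, is to delay training until the buffer can supply a full batch.

    // Only train once enough transitions have been collected.
    const size_t batchSize = 32;  // Assumed to match the constructor argument.
    if (replay.Size() >= batchSize)
    {
      replay.Sample(sampledStates, sampledActions, sampledRewards,
                    sampledNextStates, isTerminal);
      // ... run one learning step on the sampled batch ...
    }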
void Store(const StateType& state, ActionType action, double reward, const StateType& nextState, bool isEnd) [inline]
Store the given experience.
Parameters
  state       Given state.
  action      Given action.
  reward      Given reward.
  nextState   Given next state.
  isEnd       Whether the next state is a terminal state.
Definition at line 89 of file random_replay.hpp.
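Continuing the sketch above, an episode loop typically stores one transition per step and restarts the episode at terminal states; the fixed action below is a placeholder for the agent's policy, and the step count is arbitrary.

    for (size_t step = 0; step < 1000; ++step)
    {
      // Placeholder for a policy decision (assumes an enum-style Action).
      CartPole::Action action = CartPole::Action::forward;

      CartPole::State nextState;
      const double reward = env.Sample(state, action, nextState);
      const bool isEnd = env.IsTerminal(nextState);

      replay.Store(state, action, reward, nextState, isEnd);

      // Restart the episode once a terminal state is reached.
      state = isEnd ? env.InitialSample() : nextState;
    }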
void Update(arma::mat, arma::icolvec, arma::mat, arma::mat&) [inline]
Update the priorities of transitions and update the gradients.
Parameters
  target             The learned value.
  sampledActions     Agent's sampled action.
  nextActionValues   Agent's next action.
  gradients          The model's gradients.
Definition at line 153 of file random_replay.hpp.
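The unnamed parameters in the declaration suggest that, for random replay, this member mainly keeps the interface aligned with prioritized replay and is not expected to change the stored transitions. A hedged sketch of a uniform call after a learning step follows; target, nextActionValues, and gradients stand in for quantities produced by the learning network, and sampledActions comes from the Sample() sketch above.

    // Uniform call after a learning step; for random replay this is expected
    // to be a no-op, but it lets replay types be swapped without code changes.
    arma::mat target, nextActionValues, gradients;
    // ... fill target, nextActionValues, and gradients from the network ...
    replay.Update(target, sampledActions, nextActionValues, gradients);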