RandomReplay< EnvironmentType > Class Template Reference

Implementation of random experience replay.

Public Types

using ActionType = typename EnvironmentType::Action
 Convenient typedef for action.

using StateType = typename EnvironmentType::State
 Convenient typedef for state.


Public Member Functions

 RandomReplay ()
 RandomReplay (const size_t batchSize, const size_t capacity, const size_t dimension=StateType::dimension)
 Construct an instance of random experience replay class.

void Sample (arma::mat &sampledStates, arma::icolvec &sampledActions, arma::colvec &sampledRewards, arma::mat &sampledNextStates, arma::icolvec &isTerminal)
 Sample some experiences.

const size_t & Size ()
 Get the number of transitions in the memory.

void Store (const StateType &state, ActionType action, double reward, const StateType &nextState, bool isEnd)
 Store the given experience.

void Update (arma::mat, arma::icolvec, arma::mat, arma::mat &)
 Update the priorities of transitions and update the gradients.


Detailed Description


class mlpack::rl::RandomReplay< EnvironmentType >

Implementation of random experience replay.

At each time step, the interaction between the agent and the environment is saved to a memory buffer. When needed, previous experiences are sampled from the buffer to train the agent. Typically the sample is random and the memory is a first-in-first-out (FIFO) buffer.

For more information, see the following.

Lin, Long-Ji (1993). Reinforcement learning for robots using neural networks. Fujitsu Laboratories Ltd.

Template Parameters
EnvironmentType  Desired task.

Definition at line 43 of file random_replay.hpp.
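
The buffer behaviour described above (fixed capacity, first-in-first-out overwrite, uniform random sampling) can be sketched in standard C++ as follows. This is a minimal illustration only, not the mlpack implementation: the names `Transition` and `SimpleReplay`, the fixed seed, and sampling with replacement are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical transition record, used only for this sketch.
struct Transition
{
  std::vector<double> state;
  int action;
  double reward;
  std::vector<double> nextState;
  bool isEnd;
};

// Illustrative fixed-capacity FIFO buffer with uniform random sampling.
class SimpleReplay
{
 public:
  SimpleReplay(const std::size_t batchSize, const std::size_t capacity) :
      batchSize(batchSize),
      capacity(capacity),
      position(0),
      full(false),
      memory(capacity),
      rng(42) // Fixed seed (an assumption) so runs are repeatable.
  {
    assert(capacity > 0);
  }

  // Write at the current position; once the buffer has wrapped around,
  // this overwrites the oldest stored transition.
  void Store(const Transition& t)
  {
    memory[position] = t;
    position = (position + 1) % capacity;
    if (position == 0)
      full = true;
  }

  // Draw batchSize transitions uniformly at random, with replacement,
  // from the filled part of the buffer.
  std::vector<Transition> Sample()
  {
    assert(Size() > 0);
    std::uniform_int_distribution<std::size_t> pick(0, Size() - 1);
    std::vector<Transition> batch;
    batch.reserve(batchSize);
    for (std::size_t i = 0; i < batchSize; ++i)
      batch.push_back(memory[pick(rng)]);
    return batch;
  }

  // Number of transitions currently held (never more than capacity).
  std::size_t Size() const { return full ? capacity : position; }

 private:
  std::size_t batchSize;
  std::size_t capacity;
  std::size_t position;
  bool full;
  std::vector<Transition> memory;
  std::mt19937 rng;
};
```

With a capacity of 3, storing five transitions leaves only the last three in the buffer, and `Size()` reports 3 rather than 5.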

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenient typedef for action.

Definition at line 47 of file random_replay.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenient typedef for state.

Definition at line 50 of file random_replay.hpp.

Constructor & Destructor Documentation

◆ RandomReplay() [1/2]

RandomReplay ( )

Definition at line 52 of file random_replay.hpp.

◆ RandomReplay() [2/2]

RandomReplay ( const size_t  batchSize,
const size_t  capacity,
const size_t  dimension = StateType::dimension )

Construct an instance of random experience replay class.

Parameters
batchSize  Number of examples returned at each sample.
capacity  Total memory size in terms of number of examples.
dimension  The dimension of an encoded state.

Definition at line 66 of file random_replay.hpp.

Member Function Documentation

◆ Sample()

void Sample ( arma::mat &  sampledStates,
arma::icolvec &  sampledActions,
arma::colvec &  sampledRewards,
arma::mat &  sampledNextStates,
arma::icolvec &  isTerminal )

Sample some experiences.

Parameters
sampledStates  Sampled encoded states.
sampledActions  Sampled actions.
sampledRewards  Sampled rewards.
sampledNextStates  Sampled encoded next states.
isTerminal  Indicates whether the corresponding next state is a terminal state.

Definition at line 118 of file random_replay.hpp.
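
As a hedged illustration of how the `isTerminal` output might be consumed, the sketch below computes one-step targets for a sampled batch and drops the future-value term when the next state is terminal. It uses plain `std::vector` rather than Armadillo types, and the helper `OneStepTargets`, its parameters, and the discount factor are assumptions for this example; they are not part of this class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compute one-step targets for a sampled batch.
// nextValues[i] holds the action values predicted for the i-th next state.
std::vector<double> OneStepTargets(
    const std::vector<double>& rewards,
    const std::vector<std::vector<double>>& nextValues,
    const std::vector<int>& isTerminal,
    const double discount)
{
  assert(rewards.size() == nextValues.size());
  assert(rewards.size() == isTerminal.size());

  std::vector<double> targets(rewards.size());
  for (std::size_t i = 0; i < rewards.size(); ++i)
  {
    assert(!nextValues[i].empty());
    const double best =
        *std::max_element(nextValues[i].begin(), nextValues[i].end());
    // A terminal next state contributes no future value.
    targets[i] = rewards[i] + (isTerminal[i] ? 0.0 : discount * best);
  }
  return targets;
}
```

For a terminal transition the target reduces to the reward alone, which is why the flag is returned alongside the sampled rewards and next states.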

◆ Size()

const size_t& Size ( )

Get the number of transitions in the memory.

Returns
Actual used memory size.

Definition at line 140 of file random_replay.hpp.

◆ Store()

void Store ( const StateType &  state,
ActionType  action,
double  reward,
const StateType &  nextState,
bool  isEnd )

Store the given experience.

Parameters
state  Given state.
action  Given action.
reward  Given reward.
nextState  Given next state.
isEnd  Whether the next state is a terminal state.

Definition at line 89 of file random_replay.hpp.

◆ Update()

void Update ( arma::mat  ,
arma::icolvec  ,
arma::mat  ,
arma::mat &  )

Update the priorities of transitions and update the gradients.

Parameters
target  The learned value.
sampledActions  Agent's sampled action.
nextActionValues  Agent's next action.
gradients  The model's gradients.

Definition at line 153 of file random_replay.hpp.

The documentation for this class was generated from the following file:
  • /home/jenkins-mlpack/mlpack.org/_src/mlpack-3.2.1/src/mlpack/methods/reinforcement_learning/replay/random_replay.hpp