PrioritizedReplay< EnvironmentType > Class Template Reference

Implementation of prioritized experience replay.

Public Types

using ActionType = typename EnvironmentType::Action
 Convenient typedef for action.

using StateType = typename EnvironmentType::State
 Convenient typedef for state.

Public Member Functions

 PrioritizedReplay ()
 Default constructor.

 PrioritizedReplay (const size_t batchSize, const size_t capacity, const double alpha, const size_t dimension=StateType::dimension)
 Construct an instance of the prioritized experience replay class.

void BetaAnneal ()
 Anneal beta.

void Sample (arma::mat &sampledStates, arma::icolvec &sampledActions, arma::colvec &sampledRewards, arma::mat &sampledNextStates, arma::icolvec &isTerminal)
 Sample experiences according to their priorities.

arma::ucolvec SampleProportional ()
 Sample experiences according to their priorities.

const size_t & Size ()
 Get the number of transitions in the memory.

void Store (const StateType &state, ActionType action, double reward, const StateType &nextState, bool isEnd)
 Store the given experience and set its priority.

void Update (arma::mat target, arma::icolvec sampledActions, arma::mat nextActionValues, arma::mat &gradients)
 Update the priorities of the sampled transitions and the gradients.

void UpdatePriorities (arma::ucolvec &indices, arma::colvec &priorities)
 Update priorities of sampled transitions.

Detailed Description


template<typename EnvironmentType>

class mlpack::rl::PrioritizedReplay< EnvironmentType >

Implementation of prioritized experience replay.

Prioritized experience replay assigns a priority to each stored transition and replays important transitions more frequently, which lets the agent learn more efficiently.

@article{schaul2015prioritized,
title = {Prioritized experience replay},
author = {Schaul, Tom and Quan, John and Antonoglou,
Ioannis and Silver, David},
journal = {arXiv preprint arXiv:1511.05952},
year = {2015}
}
Template Parameters
EnvironmentType   Desired task; must define the nested types State and Action.

Definition at line 39 of file prioritized_replay.hpp.

Member Typedef Documentation

◆ ActionType

using ActionType = typename EnvironmentType::Action

Convenient typedef for action.

Definition at line 43 of file prioritized_replay.hpp.

◆ StateType

using StateType = typename EnvironmentType::State

Convenient typedef for state.

Definition at line 46 of file prioritized_replay.hpp.

Constructor & Destructor Documentation

◆ PrioritizedReplay() [1/2]

PrioritizedReplay ( )
inline

Default constructor.

Definition at line 51 of file prioritized_replay.hpp.

◆ PrioritizedReplay() [2/2]

PrioritizedReplay ( const size_t  batchSize,
const size_t  capacity,
const double  alpha,
const size_t  dimension = StateType::dimension 
)
inline

Construct an instance of the prioritized experience replay class.

Parameters
batchSize   Number of examples returned at each sample.
capacity    Total memory size in terms of number of examples.
alpha       How much prioritization is used.
dimension   The dimension of an encoded state.

Definition at line 62 of file prioritized_replay.hpp.
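
To make the alpha parameter concrete, the following is a minimal sketch of the proportional scheme from the cited Schaul et al. paper, in which transition i is drawn with probability p_i^alpha / sum_k p_k^alpha. The helper SamplingProbabilities is hypothetical and not part of mlpack; whether this class applies alpha in exactly this way is an assumption to check against prioritized_replay.hpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): turn raw priorities into
// sampling probabilities P(i) = p_i^alpha / sum_k p_k^alpha.
std::vector<double> SamplingProbabilities(const std::vector<double>& priorities,
                                          const double alpha)
{
  assert(!priorities.empty());
  std::vector<double> probs(priorities.size());
  double total = 0.0;
  for (std::size_t i = 0; i < priorities.size(); ++i)
  {
    probs[i] = std::pow(priorities[i], alpha);
    total += probs[i];
  }
  assert(total > 0.0);  // at least one transition must be sampleable
  for (double& p : probs)
    p /= total;
  return probs;
}
```

Under this formulation, alpha = 0 makes every transition equally likely, while alpha = 1 makes the probabilities directly proportional to the priorities.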

Member Function Documentation

◆ BetaAnneal()

void BetaAnneal ( )
inline

Anneal beta (the importance-sampling exponent in the cited paper).

Definition at line 203 of file prioritized_replay.hpp.

Referenced by PrioritizedReplay< EnvironmentType >::Sample().
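
In the cited paper, beta is annealed linearly toward 1 over the course of training. A minimal sketch of one such step is given below; AnnealBetaStep, its parameters, and the clamp at 1 are illustrative assumptions rather than this class's actual members or behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of mlpack): advance beta by one step along a
// straight line that goes from initialBeta to 1 in annealSteps steps.
// The result is clamped so that it never exceeds 1.
double AnnealBetaStep(const double beta,
                      const double initialBeta,
                      const std::size_t annealSteps)
{
  assert(annealSteps > 0);
  const double step = (1.0 - initialBeta) / static_cast<double>(annealSteps);
  return std::min(1.0, beta + step);
}
```

Calling the helper once per sampling step would move beta from its initial value to 1 after annealSteps calls and hold it there afterwards.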

◆ Sample()

void Sample ( arma::mat &  sampledStates,
arma::icolvec &  sampledActions,
arma::colvec &  sampledRewards,
arma::mat &  sampledNextStates,
arma::icolvec &  isTerminal 
)
inline

Sample experiences according to their priorities.

Parameters
sampledStates       Sampled encoded states.
sampledActions      Sampled actions.
sampledRewards      Sampled rewards.
sampledNextStates   Sampled encoded next states.
isTerminal          Indicates whether the corresponding next state is a terminal state.

Definition at line 149 of file prioritized_replay.hpp.

References PrioritizedReplay< EnvironmentType >::BetaAnneal(), SumTree< T >::Get(), PrioritizedReplay< EnvironmentType >::SampleProportional(), and SumTree< T >::Sum().
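
Because prioritized sampling is not uniform, the cited paper corrects for the resulting bias with importance-sampling weights w_i = (N * P(i))^(-beta), rescaled by the largest weight. The sketch below illustrates that formula only; ImportanceWeights is a hypothetical helper, and normalizing over the sampled batch (rather than the whole memory) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): importance-sampling weights
// w_i = (N * P(i))^(-beta), divided by the largest weight in the batch.
std::vector<double> ImportanceWeights(const std::vector<double>& probs,
                                      const std::size_t memorySize,
                                      const double beta)
{
  assert(!probs.empty());
  assert(memorySize > 0);
  std::vector<double> weights(probs.size());
  double maxWeight = 0.0;
  for (std::size_t i = 0; i < probs.size(); ++i)
  {
    assert(probs[i] > 0.0);  // a sampled transition has nonzero probability
    weights[i] = std::pow(static_cast<double>(memorySize) * probs[i], -beta);
    maxWeight = std::max(maxWeight, weights[i]);
  }
  for (double& w : weights)
    w /= maxWeight;
  return weights;
}
```

With beta = 0 all weights equal 1 (no correction); with beta = 1 frequently sampled transitions are down-weighted in full proportion to their sampling probability.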

◆ SampleProportional()

arma::ucolvec SampleProportional ( )
inline

Sample experiences according to their priorities.

Returns
The indices of the sampled experiences.

Definition at line 126 of file prioritized_replay.hpp.

References SumTree< T >::FindPrefixSum(), and SumTree< T >::Sum().

Referenced by PrioritizedReplay< EnvironmentType >::Sample().
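
The references to SumTree< T >::FindPrefixSum() and SumTree< T >::Sum() suggest prefix-sum sampling over a sum tree. The following is a minimal sketch of that general technique, assuming stratified sampling (one draw per equal slice of the total priority mass). TinySumTree and SampleIndices are hypothetical names, not mlpack's SumTree or this class's method.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical minimal sum tree (not mlpack's SumTree): leaves hold
// priorities and every internal node holds the sum of its two children.
class TinySumTree
{
 public:
  explicit TinySumTree(const std::size_t capacity) : leaves(1)
  {
    while (leaves < capacity)
      leaves *= 2;  // round the leaf count up to a power of two
    nodes.assign(2 * leaves, 0.0);
  }

  // Set leaf idx, then refresh the sums on the path up to the root (node 1).
  void Set(std::size_t idx, const double priority)
  {
    assert(idx < leaves);
    idx += leaves;
    nodes[idx] = priority;
    for (idx /= 2; idx >= 1; idx /= 2)
      nodes[idx] = nodes[2 * idx] + nodes[2 * idx + 1];
  }

  double Sum() const { return nodes[1]; }

  // Return the first leaf whose running prefix sum exceeds mass.
  std::size_t FindPrefixSum(double mass) const
  {
    std::size_t idx = 1;
    while (idx < leaves)
    {
      idx *= 2;                // descend to the left child
      if (mass >= nodes[idx])  // target lies in the right subtree
      {
        mass -= nodes[idx];
        ++idx;
      }
    }
    return idx - leaves;
  }

 private:
  std::size_t leaves;
  std::vector<double> nodes;
};

// Stratified draw: split [0, Sum) into batchSize equal slices and pick one
// point uniformly inside each slice.
std::vector<std::size_t> SampleIndices(const TinySumTree& tree,
                                       const std::size_t batchSize,
                                       std::mt19937& rng)
{
  std::vector<std::size_t> indices(batchSize);
  const double slice = tree.Sum() / static_cast<double>(batchSize);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  for (std::size_t i = 0; i < batchSize; ++i)
    indices[i] = tree.FindPrefixSum((static_cast<double>(i) + unit(rng)) * slice);
  return indices;
}
```

Both Set and FindPrefixSum walk a single root-to-leaf path, so each costs time logarithmic in the capacity, which is the usual reason for choosing a sum tree over a flat cumulative array.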

◆ Size()

const size_t& Size ( )
inline

Get the number of transitions in the memory.

Returns
The number of transitions currently stored.

Definition at line 195 of file prioritized_replay.hpp.

◆ Store()

void Store ( const StateType &  state,
ActionType  action,
double  reward,
const StateType &  nextState,
bool  isEnd 
)
inline

Store the given experience and set its priority.

Parameters
state       Given state.
action      Given action.
reward      Given reward.
nextState   Given next state.
isEnd       Whether the next state is a terminal state.

Definition at line 99 of file prioritized_replay.hpp.

References SumTree< T >::Set().
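
A fixed-capacity replay memory is commonly organized as a ring buffer, and the cited paper gives new transitions the maximal priority so that each is replayed at least once. The sketch below illustrates both ideas under those assumptions; RingPriorities is a hypothetical stand-in that stores only priorities, not mlpack's actual data layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the replay memory (not mlpack's class): it tracks
// only the write position, the fill state, and one priority per slot.
class RingPriorities
{
 public:
  explicit RingPriorities(const std::size_t capacity) :
      priorities(capacity, 0.0), position(0), full(false), maxPriority(1.0)
  {
    assert(capacity > 0);
  }

  // Claim the next slot, overwriting the oldest entry once the buffer is
  // full. The new entry receives maxPriority (assumed policy). maxPriority
  // stays at its initial value in this sketch; a complete implementation
  // would raise it whenever priorities are updated.
  std::size_t Store()
  {
    const std::size_t slot = position;
    priorities[slot] = maxPriority;
    position = (position + 1) % priorities.size();
    if (position == 0)
      full = true;
    return slot;
  }

  // Number of valid entries: the capacity after wrapping, otherwise the
  // current write position.
  std::size_t Size() const { return full ? priorities.size() : position; }

  double Priority(const std::size_t slot) const { return priorities[slot]; }

 private:
  std::vector<double> priorities;
  std::size_t position;
  bool full;
  double maxPriority;
};
```

The Size() accessor in the sketch shows why the used size and the capacity differ until the buffer has wrapped around once.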

◆ Update()

void Update ( arma::mat  target,
arma::icolvec  sampledActions,
arma::mat  nextActionValues,
arma::mat &  gradients 
)
inline

Update the priorities of the sampled transitions and the gradients.

Parameters
target             The learned value.
sampledActions     Agent's sampled action.
nextActionValues   Agent's next action.
gradients          The model's gradients.

Definition at line 216 of file prioritized_replay.hpp.

References PrioritizedReplay< EnvironmentType >::UpdatePriorities().
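
In the proportional variant of the cited paper, a transition's new priority is the magnitude of its temporal-difference error plus a small constant. The sketch below shows only that step; PrioritiesFromTdErrors and its epsilon parameter are hypothetical, and how Update() derives the error from target and nextActionValues is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): priority_i = |tdError_i| + epsilon.
// The epsilon term keeps transitions with zero error sampleable.
std::vector<double> PrioritiesFromTdErrors(const std::vector<double>& tdErrors,
                                           const double epsilon)
{
  assert(epsilon >= 0.0);
  std::vector<double> priorities(tdErrors.size());
  for (std::size_t i = 0; i < tdErrors.size(); ++i)
    priorities[i] = std::fabs(tdErrors[i]) + epsilon;
  return priorities;
}
```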

◆ UpdatePriorities()

void UpdatePriorities ( arma::ucolvec &  indices,
arma::colvec &  priorities 
)
inline

Update priorities of sampled transitions.

Parameters
indices      The indices of the samples to be updated.
priorities   Their corresponding priorities.

Definition at line 183 of file prioritized_replay.hpp.

References SumTree< T >::BatchUpdate().

Referenced by PrioritizedReplay< EnvironmentType >::Update().
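
A batch priority update can be sketched as writing each new priority at its index while tracking the running maximum, which is then available for newly stored transitions. In the sketch below a flat vector stands in for the SumTree that the class references; ApplyPriorities and its signature are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): write priorities[i] into
// stored[indices[i]] and return the updated maximum priority.
double ApplyPriorities(std::vector<double>& stored,
                       const std::vector<std::size_t>& indices,
                       const std::vector<double>& priorities,
                       double maxPriority)
{
  assert(indices.size() == priorities.size());
  for (std::size_t i = 0; i < indices.size(); ++i)
  {
    assert(indices[i] < stored.size());
    stored[indices[i]] = priorities[i];
    maxPriority = std::max(maxPriority, priorities[i]);
  }
  return maxPriority;
}
```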


The documentation for this class was generated from the following file:
  • /home/jenkins-mlpack/mlpack.org/_src/mlpack-3.2.1/src/mlpack/methods/reinforcement_learning/replay/prioritized_replay.hpp