Hands-On Reinforcement Learning with Python
上QQ阅读APP看书,第一时间看更新

Value function

A value function denotes how good it is for an agent to be in a particular state. It is dependent on the policy and is often denoted by v(s). It is equal to the total expected reward received by the agent starting from the initial state. There can be several value functions; the optimal value function is the one that has the highest value for all the states compared to other value functions. Similarly, an optimal policy is the one that has the optimal value function.