Hands-On Reinforcement Learning with Python
上QQ阅读APP看书,第一时间看更新

Policy function

A policy defines the agent's behavior in an environment. The way in which the agent decides which action to perform depends on the policy. Say you want to reach your office from home; there will be different routes to reach your office, and some routes are shortcuts, while some routes are long. These routes are called policies because they represent the way in which we choose to perform an action to reach our goal. A policy is often denoted by the symbol 𝛑. A policy can be in the form of a lookup table or a complex search process.