Danger
The whole chapter is very early work in progress.
Dynamic Programming
The algorithms that we are going to cover in this section are known as dynamic programming. Dynamic programming is not commonly used to solve reinforcement learning tasks. In fact there is no learning involved at all. Instead of interacting with the environment to find an optimal policy, dynamic programming utilizes planning.
Info
Planning utilizes a model of the environment to improve a policy.
Dynamic programming requires the full knowledge of the model of the environment and calculates the optimal value function and optimal policy through the knowledge of that model. The interaction between the agent and the environment is not necessary. While the access to the model of the environment is an unrealistic assumtion, the knowledge tha you will gain by studying dynamic programming algorithms is transferable to reinforcement learning.