Mathematical Principles of Reinforcement Learning
0. Preview
1. Basic Concepts
1.1 Markov Decision Process (MDP)
2. State Value and Bellman Equation
2.1 State Value
2.2 Bellman Equation
