Notes on the Derivation of Least Squares Policy Iteration

Here are my notes on the derivation of the Least Squares Policy Iteration (LSPI) algorithm, based on the original paper by Lagoudakis and Parr.
