CPS196 - Fall 1999
Markov Decision Processes
Background: Value functions, or cost-to-go functions, estimate the
benefit of states in terms of some reward measure. They are used
often in optimal control and learning, as we'll see.
This chapter describes value iteration and policy iteration, which are
schemes for computing optimal value functions. They find the value
function obtained by maximizing expected reward.
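As a rough illustration of value iteration, here is a minimal sketch in Python. The two-state MDP, the action names "stay" and "go", the discount factor of 0.9, and the helper names (q_value, value_iteration, greedy_policy) are all assumptions made up for this example and do not come from the chapter. The loop repeatedly applies the Bellman optimality backup until the values stop changing, then reads off a greedy policy.

```python
# Hypothetical two-state, two-action MDP, invented for illustration only.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(V, s, a, gamma=GAMMA):
    """Expected discounted return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(gamma=GAMMA, tol=1e-8):
    """Apply the Bellman optimality backup until the largest change is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(V, s, a, gamma) for a in P[s]) for s in P}
        if max(abs(new_V[s] - V[s]) for s in P) < tol:
            return new_V
        V = new_V


def greedy_policy(V, gamma=GAMMA):
    """Pick, in each state, the action with the highest one-step lookahead value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a, gamma)) for s in P}


V = value_iteration()
policy = greedy_policy(V)
```

In this toy problem, staying in state 1 earns reward 2 forever, so its value approaches 2 / (1 - 0.9) = 20, and the value of state 0 approaches 1 + 0.9 * 20 = 19. Policy iteration would reach the same values by alternating policy evaluation with greedy improvement rather than backing up values directly.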
Questions:
- What is an advantage of value functions? Disadvantage?
Offline: PROJECT
Background: INFO
Questions:
Notes
Modified: Thu Aug 26 15:56:47 EDT 1999
by Michael Littman, mlittman@cs.duke.edu