A Markov decision process (MDP) has:
- A finite set of states S.
- A finite set of actions A.
- A transition function T(s,a,s') for all s, s' ∈ S and a ∈ A.
- A reward function R(s,a) for all s ∈ S and a ∈ A.
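
The four components above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation. It assumes that T(s,a,s') is the probability of reaching s' after taking action a in state s, which the definition above does not state explicitly. The class name `MDP`, the method names, and the two-state example `toy` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MDP:
    """Container for the four MDP components (hypothetical names)."""
    states: List[str]                      # finite set of states S
    actions: List[str]                     # finite set of actions A
    T: Dict[Tuple[str, str, str], float]   # T[(s, a, s')]; missing entries mean 0
    R: Dict[Tuple[str, str], float]        # R[(s, a)]

    def transition(self, s: str, a: str, s_next: str) -> float:
        """Return T(s, a, s'), treating unspecified triples as 0."""
        return self.T.get((s, a, s_next), 0.0)

    def is_well_formed(self, tol: float = 1e-9) -> bool:
        """Check that R covers every (s, a) pair and that, under the
        assumption that T is a probability, T(s, a, .) sums to 1."""
        for s in self.states:
            for a in self.actions:
                if (s, a) not in self.R:
                    return False
                total = sum(self.transition(s, a, s2) for s2 in self.states)
                if abs(total - 1.0) > tol:
                    return False
        return True


# A made-up two-state example, for illustration only.
toy = MDP(
    states=["s0", "s1"],
    actions=["stay", "go"],
    T={
        ("s0", "stay", "s0"): 1.0,
        ("s0", "go", "s1"): 0.9,
        ("s0", "go", "s0"): 0.1,
        ("s1", "stay", "s1"): 1.0,
        ("s1", "go", "s0"): 1.0,
    },
    R={
        ("s0", "stay"): 0.0,
        ("s0", "go"): 1.0,
        ("s1", "stay"): 0.5,
        ("s1", "go"): 0.0,
    },
)
```

Storing T as a sparse dictionary keeps the example short. For larger problems a dense array indexed by state and action numbers would be a reasonable alternative.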