Multiple Hits

How can we express the value of the optimal strategy then?

As in the Markov chain case, we can express the optimal value of a state in terms of the optimal values of other states:

\begin{displaymath}
v^d(p) = \max\left(r^{p}(d), \sum_{p'} \Pr(p'\vert p) v^d(p')\right).\end{displaymath}

Equations like this are often called ``Bellman equations'' after Richard Bellman, who brought them to prominence.

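This is just like the Markov chain equation, except for the max: in each state the player takes the better of two options, stopping with the current reward (the first term) or continuing to a new state (the second term).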
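To make the recursion concrete, here is a minimal Python sketch that solves the equation for one fixed dealer card d by working backwards from the highest player total. It rests on assumptions that are not in the text above: cards come from an infinite deck, an ace always counts as 1, busting pays -1, and the stand rewards are supplied by the caller. The names `optimal_values`, `stand_reward`, `CARD_PROBS`, and `BUST_VALUE`, as well as the toy reward table, are purely illustrative.

```python
from fractions import Fraction

# Assumed card model: infinite deck, ace always counts as 1,
# and the four ten-valued ranks are lumped together as 10.
CARD_PROBS = {card: Fraction(1, 13) for card in range(1, 10)}
CARD_PROBS[10] = Fraction(4, 13)

BUST_VALUE = Fraction(-1)  # assumed payoff when the player busts


def optimal_values(stand_reward, min_total=4, max_total=21):
    """Solve v(p) = max(r(p), sum_p' Pr(p'|p) v(p')) for one dealer card.

    stand_reward maps each player total p to the reward for stopping at p.
    Returns two dicts keyed by player total: optimal values and actions.
    """
    values, actions = {}, {}
    # A hit always raises the total, so every successor of p is larger
    # than p. Sweeping from high totals to low ones means each successor
    # value is already known when it is needed.
    for p in range(max_total, min_total - 1, -1):
        hit = Fraction(0)
        for card, prob in CARD_PROBS.items():
            nxt = p + card
            hit += prob * (BUST_VALUE if nxt > max_total else values[nxt])
        stand = stand_reward[p]
        values[p] = max(stand, hit)  # the max in the Bellman equation
        actions[p] = "stand" if stand >= hit else "hit"
    return values, actions


# Hypothetical stand rewards, chosen only to exercise the code:
# lose below 17, break even on 17 to 20, win on 21.
toy_reward = {p: Fraction(-1) if p < 17 else Fraction(0) for p in range(4, 22)}
toy_reward[21] = Fraction(1)

values, actions = optimal_values(toy_reward)
```

Because the successor states of p all have larger totals, a single backward sweep is enough here; no iteration to convergence is needed. Exact fractions are used so that ties between standing and hitting are not decided by rounding error.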

