Abstract
Most quantities of interest in discounted and undiscounted (semi-)Markov decision processes can be obtained by solving a system of functional equations. This paper derives bounds and variational characterizations for the solutions of such systems. These are useful for at least three reasons: (1) in any solution procedure, the upper and lower bounds can be used to measure the deviation of the current solution from optimality; (2) this in turn may permit elimination of suboptimal actions; and (3) the variational characterizations suggest numerical algorithms (linear programming, policy iteration algorithms, successive approximation schemes).
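To make point (1) concrete, the following is a minimal sketch of successive approximation for a discounted, reward-maximizing Markov decision process, with upper and lower bounds on the optimal value computed at every step. It is not taken from the paper: it uses the classical MacQueen-type bounds for the discounted case only, and the function names, the discount factor `beta`, and the two-state example data are all illustrative assumptions.

```python
def bellman_update(v, rewards, transitions, beta):
    """One successive-approximation step:
    (Tv)(s) = max_a [ r(s, a) + beta * sum_j p(j | s, a) * v(j) ]."""
    return [
        max(
            rewards[s][a]
            + beta * sum(p * v[j] for j, p in enumerate(transitions[s][a]))
            for a in range(len(rewards[s]))
        )
        for s in range(len(v))
    ]


def value_iteration_with_bounds(rewards, transitions, beta, tol=1e-8, max_iter=10_000):
    """Iterate v <- Tv and return (lower, upper) bounds on the optimal value.

    With d = Tv - v, the classical bounds for 0 <= beta < 1 are
        Tv + beta/(1-beta) * min(d)  <=  v*  <=  Tv + beta/(1-beta) * max(d).
    """
    scale = beta / (1.0 - beta)
    v = [0.0] * len(rewards)
    lower, upper = v, v
    for _ in range(max_iter):
        v_next = bellman_update(v, rewards, transitions, beta)
        diffs = [x - y for x, y in zip(v_next, v)]
        lower = [x + scale * min(diffs) for x in v_next]
        upper = [x + scale * max(diffs) for x in v_next]
        v = v_next
        # The gap upper - lower is identical in every state, so checking
        # one state is enough to measure the deviation from optimality.
        if upper[0] - lower[0] < tol:
            break
    return lower, upper


# Hypothetical two-state, two-action example (numbers are illustrative only).
rewards = [[1.0, 0.0], [2.0, 0.5]]          # rewards[s][a]
transitions = [                              # transitions[s][a][j] = p(j | s, a)
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
]
lower, upper = value_iteration_with_bounds(rewards, transitions, beta=0.9)
```

The width of the interval `[lower, upper]` serves as a stopping criterion. The same bounds are the usual starting point for action-elimination tests of the kind mentioned in point (2), which this sketch does not implement.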
Full Citation
Federgruen, Awi and Paul Schweitzer. “Variational characterizations in Markov decision processes.” Journal of Mathematical Analysis and Applications, vol. 117 (August 01, 1986): 326-357.