Characterization of the Optimal Risk-Sensitive Average Cost in Denumerable Markov Decision Chains

Document type:
Article
Author:
Cavazos-Cadena, Rolando
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.2017.0893
Publication date:
2018
Pages:
1025-1050
Keywords:
infinite-horizon risk STATE-SPACE portfolio SYSTEM
Abstract:
This work is concerned with Markov decision chains on a denumerable state space. The controller has a positive risk-sensitivity coefficient, and the performance of a control policy is measured by a risk-sensitive average cost criterion. Besides standard continuity-compactness conditions, it is assumed that the state process is communicating under any stationary policy, and that the simultaneous Doeblin condition holds. In this context, it is shown that if the cost function is bounded from below, and the superior limit average index is finite at some point, then (i) the optimal superior and inferior limit average value functions coincide and are constant, (ii) the optimal average cost is characterized via an extended version of the Collatz-Wielandt formula in the theory of positive matrices, and (iii) an optimality inequality is established, from which a stationary optimal policy is obtained. Moreover, an explicit example is given to show that, even if the cost function is bounded, the strict inequality may occur in the optimality relation.
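For orientation, the two objects named in the abstract can be written out as follows. This is a sketch in assumed notation, not the paper's own statement: the symbols $\lambda$ (risk-sensitivity coefficient), $C$ (cost function), $X_t$, $A_t$ (state and action at time $t$), $J$, and $Q$ are chosen here for illustration, and the display uses the usual form of the superior limit risk-sensitive average criterion together with the classical Collatz-Wielandt formula for a finite positive matrix. The extended denumerable-state version is the paper's contribution and is not reproduced here.

```latex
% Risk-sensitive (superior limit) average cost of policy \pi from state x,
% with risk-sensitivity coefficient \lambda > 0 (assumed notation):
J(x, \pi) \;=\; \limsup_{n \to \infty} \frac{1}{\lambda n}
  \log \mathbb{E}_x^{\pi}\!\left[ \exp\!\Big( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \Big) \right].

% Classical Collatz-Wielandt formula for a square matrix A with strictly
% positive entries and spectral radius \rho(A):
\rho(A) \;=\; \max_{x > 0} \, \min_{i} \frac{(Ax)_i}{x_i}
        \;=\; \min_{x > 0} \, \max_{i} \frac{(Ax)_i}{x_i}.

% Link in the simplest setting (finite state space, no control, irreducible
% transition matrix [p_{xy}]): with Q = [\, e^{\lambda C(x)} p_{xy} \,],
% the risk-sensitive average cost is constant and equals
\frac{1}{\lambda} \log \rho(Q).
```

In that finite uncontrolled case, the expectation inside the logarithm equals $(Q^n \mathbf{1})(x)$, so its exponential growth rate is $\rho(Q)$. This is why a Collatz-Wielandt-type min-max expression is a natural candidate for characterizing the optimal average cost.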