The value iteration algorithm in risk-sensitive average Markov decision chains with finite state space
Publication type:
Article
Authors:
Cavazos-Cadena, R; Montes-de-Oca, R
Affiliation:
Universidad Autonoma Metropolitana - Mexico
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.28.4.752.20515
Publication date:
2003
Pages:
752-776
Keywords:
Discrete-time
Abstract:
This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which is an extension of a transformation introduced by Schweitzer (1971) to analyze the risk-neutral case.
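To make the scheme described in the abstract concrete, the following is a minimal sketch of risk-sensitive value iteration on a finite model. It is not taken from the paper: the function name, the array layout `P[a][x][y]` and `C[x][a]`, the parameter name `lam` for the risk-sensitivity coefficient, the stopping rule based on the spread of successive differences, and the renormalization by the first state are all assumptions made for illustration. In particular, the sketch omits the Schweitzer-type model modification mentioned in the abstract, so the stopping rule is only expected to trigger on models where the differences actually converge.

```python
import math

def risk_sensitive_value_iteration(P, C, lam, tol=1e-8, max_iter=10_000):
    """Illustrative sketch (hypothetical interface, not the paper's code).

    P[a][x][y] : probability of moving from state x to y under action a
    C[x][a]    : one-step cost of action a in state x
    lam        : nonzero risk-sensitivity coefficient
    Returns (approximate optimal average cost, greedy stationary policy).
    """
    n_states, n_actions = len(C), len(C[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V, policy = [], []
        for x in range(n_states):
            best_val, best_a = None, None
            for a in range(n_actions):
                # Certainty equivalent (1/lam) * log sum_y p(y) * exp(lam * V[y]),
                # computed with a shift to avoid overflow.
                terms = [(P[a][x][y], lam * V[y])
                         for y in range(n_states) if P[a][x][y] > 0.0]
                shift = max(e for _, e in terms)
                total = sum(p * math.exp(e - shift) for p, e in terms)
                value = C[x][a] + (shift + math.log(total)) / lam
                if best_val is None or value < best_val:
                    best_val, best_a = value, a
            new_V.append(best_val)
            policy.append(best_a)
        # The smallest and largest successive differences bracket the optimal
        # average cost when the optimality equation has a solution.
        diffs = [nv - v for nv, v in zip(new_V, V)]
        lower, upper = min(diffs), max(diffs)
        if upper - lower < tol:
            return 0.5 * (lower + upper), policy
        # Subtracting a constant keeps the iterates bounded without
        # changing the differences between states.
        V = [nv - new_V[0] for nv in new_V]
    raise RuntimeError("differences did not converge within max_iter")

# Hypothetical two-state example: action 0 stays, action 1 switches state.
P_demo = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
C_demo = [[1.0, 3.0], [2.0, 0.5]]
g_demo, policy_demo = risk_sensitive_value_iteration(P_demo, C_demo, lam=1.0)
print(g_demo, policy_demo)
```

The interval between the smallest and largest difference is what gives an error bound after finitely many steps; the midpoint is returned here only as one convenient point estimate.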
Source URL: