
The value iteration algorithm in risk-sensitive average Markov decision chains with finite state space

Document type:

Article

Authors:

Cavazos-Cadena, R; Montes-de-Oca, R

Affiliation:

Universidad Autonoma Metropolitana - Mexico

Journal:

MATHEMATICS OF OPERATIONS RESEARCH

ISSN/ISBN:

0364-765X

DOI:

10.1287/moor.28.4.752.20515

Publication date:

2003

Pages:

752-776

Keywords:

Discrete-time

Abstract:

This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which is an extension of a transformation introduced by Schweitzer (1971) to analyze the risk-neutral case.
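
As a rough illustration of the kind of scheme the abstract refers to, the following is a minimal sketch of value iteration for a risk-sensitive average cost, written under several assumptions that are not taken from the paper. The function name `risk_sensitive_value_iteration`, the nested-list layout of `P` and `C`, and the parameter `lam` (standing for the risk-sensitivity coefficient) are all hypothetical. The stopping rule uses the fact that the operator applied here is monotone and commutes with adding constants, so that, when the optimality equation has a solution, the smallest and largest one-step differences bracket the optimal average cost. The sketch does not implement the Schweitzer-type transformation of the model that the paper relies on, so the gap between the two bounds may fail to shrink for some models (for instance periodic ones).

```python
import math


def risk_sensitive_value_iteration(P, C, lam, tol=1e-8, max_iter=10_000):
    """Hedged sketch, not the paper's algorithm.

    P[x][a][y]: probability of moving from state x to y under action a.
    C[x][a]:    bounded one-step cost of action a in state x.
    lam:        nonzero constant risk-sensitivity coefficient (assumed name).
    Returns (lower, upper, policy); lower and upper are the smallest and
    largest one-step differences at the last iteration performed.
    """
    n = len(P)
    V = [0.0] * n
    lower, upper, policy = -math.inf, math.inf, [0] * n
    for _ in range(max_iter):
        # Shift so every exponent lam * (v - shift) is <= 0 (avoids overflow).
        shift = max(V) if lam > 0 else min(V)
        weights = [math.exp(lam * (v - shift)) for v in V]
        new_V, policy = [], []
        for x in range(n):
            best_a, best_val = 0, math.inf
            for a, row in enumerate(P[x]):
                expectation = sum(p * w for p, w in zip(row, weights))
                # Cost plus certainty equivalent (1/lam) * log E[exp(lam * V)].
                val = C[x][a] + shift + math.log(expectation) / lam
                if val < best_val:
                    best_a, best_val = a, val
            new_V.append(best_val)
            policy.append(best_a)
        diffs = [nv - v for nv, v in zip(new_V, V)]
        lower, upper = min(diffs), max(diffs)
        if upper - lower < tol:
            break
        # Keep relative values only, so the iterates stay bounded.
        V = [nv - new_V[0] for nv in new_V]
    return lower, upper, policy
```

A call such as `risk_sensitive_value_iteration([[[0.5, 0.5]], [[0.5, 0.5]]], [[0.0], [1.0]], lam=1.0)` runs the sketch on a two-state, single-action chain; the returned `policy` is only the greedy choice at the last iteration and is offered as a candidate, not as a certified near-optimal policy.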
