Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes

Document type:

Article

Authors:

Shlakhter, Oleksandr; Lee, Chi-Guhn; Khmelev, Dmitry; Jaber, Nasser

Affiliations:

University of Toronto; University of Toronto

Journal:

OPERATIONS RESEARCH

ISSN/ISBN:

0030-364X

DOI:

10.1287/opre.1090.0705

Publication date:

2010

Pages:

193-202

Keywords:

Abstract:

We study a general approach to accelerating the convergence of the most widely used solution method for Markov decision processes (MDPs) with the total expected discounted reward. Inspired by the monotone behavior of contraction mappings in the feasible set of the linear programming problem equivalent to the MDP, we establish a class of operators that can be used in combination with a contraction mapping operator in the standard value iteration algorithm and its variants. We then propose two such operators, which can be easily implemented as part of the value iteration algorithm and its variants. Numerical studies show that the computational savings can be significant, especially when the discount factor approaches one and the transition probability matrix becomes dense, cases in which the standard value iteration algorithm and its variants suffer from slow convergence.
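
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard value iteration on a small, dense, randomly generated MDP. It does not implement the acceleration operators proposed in the paper; it only illustrates how the iteration count of the unaccelerated algorithm grows as the discount factor approaches one. The function names (`bellman_update`, `value_iteration`, `random_dense_mdp`), the random test problem, and the simple successive-difference stopping rule are all assumptions made for illustration.

```python
import random


def bellman_update(v, P, r, gamma):
    """Apply the Bellman optimality operator once to the value vector v.

    P[s][a][t] is the probability of moving from state s to state t under
    action a; r[s][a] is the immediate reward. (Hypothetical data layout.)
    """
    n_states = len(v)
    return [
        max(
            r[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n_states))
            for a in range(len(r[s]))
        )
        for s in range(n_states)
    ]


def value_iteration(P, r, gamma, tol=1e-6, max_iter=100_000):
    """Iterate v <- T v until successive iterates differ by less than tol.

    Returns the final value vector and the number of iterations used.
    The stopping rule is a plain sup-norm difference, chosen for brevity.
    """
    v = [0.0] * len(P)
    for k in range(1, max_iter + 1):
        new_v = bellman_update(v, P, r, gamma)
        diff = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if diff < tol:
            return v, k
    return v, max_iter


def random_dense_mdp(n_states, n_actions, seed=0):
    """Build a toy MDP whose transition rows are all strictly positive."""
    rng = random.Random(seed)
    P, r = [], []
    for _ in range(n_states):
        P_s, r_s = [], []
        for _ in range(n_actions):
            weights = [rng.random() + 0.01 for _ in range(n_states)]
            total = sum(weights)
            P_s.append([w / total for w in weights])  # normalize to a distribution
            r_s.append(rng.random())
        P.append(P_s)
        r.append(r_s)
    return P, r


if __name__ == "__main__":
    P, r = random_dense_mdp(n_states=5, n_actions=3)
    for gamma in (0.5, 0.9, 0.99):
        _, iters = value_iteration(P, r, gamma)
        print(f"gamma={gamma}: {iters} iterations")
```

Running the script prints the iteration count for each discount factor; on this toy problem the count rises sharply as the discount factor moves toward one, which is the slow-convergence regime the abstract describes.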