COMPUTATIONALLY FEASIBLE BOUNDS FOR PARTIALLY OBSERVED MARKOV DECISION PROCESSES

Publication type:
Article
Author(s):
LOVEJOY, WS
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.39.1.162
Publication year:
1991
Pages:
162-175
Keywords:
partially observed Markov decision processes; dynamic programming; Markov; Bayesian programming; infinite state Markov models
Abstract:
A partially observed Markov decision process (POMDP) is a sequential decision problem where information concerning parameters of interest is incomplete, and possible actions include sampling, surveying, or otherwise collecting additional information. Such problems can theoretically be solved as dynamic programs, but the relevant state space is infinite, which inhibits algorithmic solution. This paper explains how to approximate the state space by a finite grid of points, and use that grid to construct upper and lower value function bounds, generate approximate nonstationary and stationary policies, and bound the value loss relative to optimal for using these policies in the decision problem. A numerical example illustrates the methodology.
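The grid idea described in the abstract can be illustrated with a minimal sketch: discretize the belief simplex into finitely many points, then run value iteration over those points, projecting each Bayes-updated belief back onto the grid. All model numbers below are hypothetical, and the nearest-point projection used here yields an approximate value function rather than the paper's certified upper and lower bounds, which rely on a specific interpolation scheme over the grid.

```python
import numpy as np

# Hypothetical 2-state, 2-action, 2-observation POMDP (numbers are illustrative).
gamma = 0.9
# T[a][s][s']: transition probabilities under action a
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
# Z[a][s'][o]: observation probabilities after landing in state s' under action a
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.5, 0.5]]])
# R[a][s]: immediate reward for taking action a in state s
R = np.array([[1.0, 0.0],
              [0.5, 0.5]])

grid = np.linspace(0.0, 1.0, 21)   # belief = P(state 0), a finite grid
V = np.zeros_like(grid)            # approximate value at each grid point

def step(b, a):
    """For each observation o: return P(o | b, a) and the Bayes-updated belief."""
    bvec = np.array([b, 1.0 - b])
    pred = bvec @ T[a]                        # predicted state distribution
    out = []
    for o in range(2):
        po = pred @ Z[a][:, o]                # probability of observing o
        if po > 1e-12:
            post = pred * Z[a][:, o] / po     # posterior belief after seeing o
            out.append((po, post[0]))
    return out

for _ in range(200):                          # value iteration over the grid
    Vnew = np.empty_like(V)
    for i, b in enumerate(grid):
        bvec = np.array([b, 1.0 - b])
        q_values = []
        for a in range(2):
            # Project each updated belief onto its nearest grid point.
            future = sum(po * V[np.abs(grid - b2).argmin()]
                         for po, b2 in step(b, a))
            q_values.append(bvec @ R[a] + gamma * future)
        Vnew[i] = max(q_values)
    V = Vnew
```

Because rewards here lie in [0, 1] and the discount factor is 0.9, the computed values must lie in [0, 1/(1 - 0.9)] = [0, 10]; checking this is a quick sanity test for an implementation.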