Myopic Bounds for Optimal Policy of POMDPs: An Extension of Lovejoy's Structural Results

Publication type:
Article
Authors:
Krishnamurthy, Vikram; Pareek, Udit
Affiliation:
University of British Columbia
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2014.1332
Publication date:
2015
Pages:
428-434
Keywords:
Markov decision processes
Abstract:
This paper provides a relaxation of the sufficient conditions and an extension of the structural results for partially observed Markov decision processes (POMDPs) obtained by Lovejoy in 1987. Sufficient conditions are provided so that the optimal policy can be upper and lower bounded by judiciously chosen myopic policies. These myopic policy bounds are constructed to maximize the volume of belief states where they coincide with the optimal policy. Numerical examples illustrate these myopic bounds for both continuous and discrete observation sets.
Source URL: