
Finite-Memory Strategies in POMDPs with Long-Run Average Objectives

Type:

Article

Authors:

Chatterjee, Krishnendu; Saona, Raimundo; Ziliotto, Bruno

Affiliations:

Institute of Science & Technology - Austria; Université PSL; Université Paris-Dauphine; Centre National de la Recherche Scientifique (CNRS)

Journal:

Mathematics of Operations Research

ISSN:

0364-765X

DOI:

10.1287/moor.2020.1116

Publication year:

2022

Pages:

100-119

Keywords:

Markov decision processes; games

Abstract:

Partially observable Markov decision processes (POMDPs) are standard models for dynamic systems with probabilistic and nondeterministic behaviour in uncertain environments. We prove that in POMDPs with long-run average objective, the decision maker has approximately optimal strategies with finite memory. This implies notably that approximating the long-run value is recursively enumerable, as well as a weak continuity property of the value with respect to the transition function.
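To make the notion of a finite-memory strategy concrete, here is a minimal Python sketch. It is not taken from the article: the two-state POMDP, its probabilities, and every name in the code (`FiniteMemoryStrategy`, `average_reward`, and so on) are hypothetical, chosen only for illustration. A finite-memory strategy is modelled as a finite set of memory states, a map from memory to action, and a memory update driven by observations. The long-run average reward is estimated by simulation rather than computed exactly.

```python
import random

# Hypothetical toy POMDP (an assumption for illustration, not from the article):
# 2 hidden states, 2 actions, 2 observations.
STAY, SWITCH = 0, 1


def step_state(s, a, rng):
    """STAY flips the hidden state with probability 0.1, SWITCH with 0.9."""
    flip_prob = 0.1 if a == STAY else 0.9
    return 1 - s if rng.random() < flip_prob else s


def observe(s, rng):
    """The observation equals the hidden state with probability 0.8."""
    return s if rng.random() < 0.8 else 1 - s


def reward(s, a):
    """Reward 1 while the hidden state is 0, otherwise 0."""
    return 1.0 if s == 0 else 0.0


class FiniteMemoryStrategy:
    """A strategy with finitely many memory states."""

    def __init__(self, action_of, update, initial_memory=0):
        self.action_of = action_of          # memory -> action
        self.update = update                # (memory, observation) -> memory
        self.initial_memory = initial_memory


def average_reward(strategy, n_steps, seed=0):
    """Monte Carlo estimate of the average reward over n_steps stages."""
    rng = random.Random(seed)
    s, m = 0, strategy.initial_memory
    total = 0.0
    for _ in range(n_steps):
        a = strategy.action_of[m]           # act on memory only
        total += reward(s, a)
        s = step_state(s, a, rng)           # hidden transition
        o = observe(s, rng)                 # noisy signal about the new state
        m = strategy.update[(m, o)]         # finite memory update
    return total / n_steps


# Two memory states: remember the last observation, switch if it was 1.
remember_last_obs = FiniteMemoryStrategy(
    action_of={0: STAY, 1: SWITCH},
    update={(m, o): o for m in (0, 1) for o in (0, 1)},
)

# One memory state: ignore observations and always stay.
blind = FiniteMemoryStrategy(
    action_of={0: STAY},
    update={(0, 0): 0, (0, 1): 0},
)

if __name__ == "__main__":
    print(average_reward(remember_last_obs, 100_000))
    print(average_reward(blind, 100_000))
```

In this toy model the two-memory-state strategy should average close to 0.74 and the blind one close to 0.5, which only illustrates that a small amount of memory can help. It says nothing about the approximation guarantee proved in the article.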