Polynomial Time Algorithms for Branching Markov Decision Processes and Probabilistic Min (Max) Polynomial Bellman Equations

Publication type:

Article

Authors:

Etessami, Kousha; Stewart, Alistair; Yannakakis, Mihalis

Affiliations:

University of Edinburgh; University of Southern California; Columbia University

Journal:

MATHEMATICS OF OPERATIONS RESEARCH

ISSN/ISBN:

0364-765X

DOI:

10.1287/moor.2018.0970

Publication date:

2020

Pages:

34-62

Keywords:

Complexity

Abstract:

We show that one can compute the least nonnegative solution (also known as the least fixed point) for a system of probabilistic min (max) polynomial equations, to any desired accuracy epsilon > 0 in time polynomial in both the encoding size of the system and in log(1/epsilon). These are Bellman optimality equations for important classes of infinite-state Markov decision processes (MDPs), including branching MDPs (BMDPs), which generalize classic multitype branching stochastic processes. We thus obtain the first polynomial time algorithm for computing, to any desired precision, optimal (maximum and minimum) extinction probabilities for BMDPs. Our algorithms are based on a novel generalization of Newton's method, which employs linear programming in each iteration. We also provide polynomial-time (P-time) algorithms for computing an epsilon-optimal policy for both maximizing and minimizing extinction probabilities in a BMDP, whereas we note a hardness result for computing an exact optimal policy. Furthermore, improving on prior results, we provide more efficient P-time algorithms for qualitative analysis of BMDPs, that is, for determining whether the maximum or minimum extinction probability is 1, and, if so, computing a policy that achieves this. We also observe some complexity consequences of our results for branching simple stochastic games, which generalize BMDPs.
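To make the objects in the abstract concrete, here is a minimal sketch of a max (or min) probabilistic polynomial equation for a toy single-type BMDP, solved by plain value iteration starting from 0. This is not the generalized Newton's method with linear programming that the paper describes; value iteration is used only because it is short and is known to converge monotonically to the least fixed point, although it can be very slow in general. The toy process, the names `action_value` and `least_fixed_point`, and the probabilities are all assumptions made for illustration.

```python
# Illustrative sketch only: plain value iteration, NOT the paper's
# generalized Newton's method. The toy BMDP below is hypothetical.

# An action is a list of (probability, number_of_offspring) terms, so it
# encodes the probabilistic polynomial sum_i p_i * x**k_i.
ACTIONS = [
    [(0.3, 0), (0.7, 2)],  # action a: die w.p. 0.3, two offspring w.p. 0.7
    [(0.2, 0), (0.8, 2)],  # action b: die w.p. 0.2, two offspring w.p. 0.8
]


def action_value(action, x):
    """Evaluate the probabilistic polynomial of one action at x."""
    return sum(p * x ** k for p, k in action)


def least_fixed_point(actions, choose=max, iterations=500):
    """Iterate x <- choose_a P_a(x) from x = 0.

    For monotone systems of this kind the iterates increase toward the
    least nonnegative solution of x = choose_a P_a(x).
    """
    x = 0.0
    for _ in range(iterations):
        x = choose(action_value(a, x) for a in actions)
    return x


# Optimal extinction probabilities for the toy process.
max_ext = least_fixed_point(ACTIONS, max)  # controller maximizes extinction
min_ext = least_fixed_point(ACTIONS, min)  # controller minimizes extinction
print(max_ext, min_ext)
```

In this toy instance action a gives the larger value for every x in [0, 1], so the maximizing equation reduces to x = 0.3 + 0.7x^2, whose least nonnegative root is 3/7, and the minimizing equation reduces to x = 0.2 + 0.8x^2, whose least nonnegative root is 1/4.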