您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 管理科学与工程 > IEEE Transactions on Automatic Control > 2021 > 3期

Strategy Synthesis for POMDPs in Robot Planning via Game-Based Abstractions

成果类型：

Article

署名作者：

Winterer, Leonore; Junges, Sebastian; Wimmer, Ralf; Jansen, Nils; Topcu, Ufuk; Katoen, Joost-Pieter; Becker, Bernd

署名单位：

University of Freiburg; University of California System; University of California Berkeley; University of Freiburg; Radboud University Nijmegen; University of Texas System; University of Texas Austin; RWTH Aachen University

刊物名称：

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISSBN：

0018-9286

DOI：

10.1109/TAC.2020.2990140

发表日期：

2021

页码：

1040-1054

关键词：

planning games Probabilistic logic Markov processes safety model checking probability distribution Partially Observable Markov Decision Process (POMDP) probabilistic model checking probabilistic two-player game (PG) robot navigation

摘要：

We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applications and exploiting typical structural properties of such scenarios; for instance, we assume that the agent has the ability to observe its own position inside an environment. We propose an abstraction refinement framework, which turns such a POMDP model into a (fully observable) probabilistic two-player game (PG). For the obtained PGs, efficient verification and synthesis tools allow to determine strategies with optimal safety and performance measures, which approximate optimal schedulers on the POMDP. If the approximation is too coarse to satisfy the given specifications, a refinement scheme improves the computed strategies. As a running example, we use planning problems where an agent moves inside an environment with randomly moving obstacles and restricted observability. We demonstrate that the proposed method advances the state of the art by solving problems several orders of magnitude larger than those that can be handled by existing POMDP solvers. Furthermore, this method gives guarantees on safety constraints, which is not supported by the majority of the existing solvers.