Strategy Synthesis for POMDPs in Robot Planning via Game-Based Abstractions

成果类型:
Article
署名作者:
Winterer, Leonore; Junges, Sebastian; Wimmer, Ralf; Jansen, Nils; Topcu, Ufuk; Katoen, Joost-Pieter; Becker, Bernd
署名单位:
University of Freiburg; University of California System; University of California Berkeley; University of Freiburg; Radboud University Nijmegen; University of Texas System; University of Texas Austin; RWTH Aachen University
刊物名称:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISSBN:
0018-9286
DOI:
10.1109/TAC.2020.2990140
发表日期:
2021
页码:
1040-1054
关键词:
planning games Probabilistic logic Markov processes safety model checking probability distribution Partially Observable Markov Decision Process (POMDP) probabilistic model checking probabilistic two-player game (PG) robot navigation
摘要:
We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applications and exploiting typical structural properties of such scenarios; for instance, we assume that the agent has the ability to observe its own position inside an environment. We propose an abstraction refinement framework, which turns such a POMDP model into a (fully observable) probabilistic two-player game (PG). For the obtained PGs, efficient verification and synthesis tools allow to determine strategies with optimal safety and performance measures, which approximate optimal schedulers on the POMDP. If the approximation is too coarse to satisfy the given specifications, a refinement scheme improves the computed strategies. As a running example, we use planning problems where an agent moves inside an environment with randomly moving obstacles and restricted observability. We demonstrate that the proposed method advances the state of the art by solving problems several orders of magnitude larger than those that can be handled by existing POMDP solvers. Furthermore, this method gives guarantees on safety constraints, which is not supported by the majority of the existing solvers.