Bounded Synthesis and Reinforcement Learning of Supervisors for Stochastic Discrete Event Systems With LTL Specifications
Publication type:
Article
Authors:
Oura, Ryohei; Ushio, Toshimitsu; Sakakibara, Ami
Affiliation:
University of Osaka
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2024.3376723
Publication year:
2024
Pages:
6668-6683
Keywords:
Probabilistic logic
Automata
safety
Discrete-event systems
Supervisory control
Stochastic processes
games
Bounded synthesis
linear temporal logic (LTL)
reinforcement learning (RL)
stochastic discrete event systems (SDESs)
Abstract:
In this article, we consider supervisory control of stochastic discrete event systems (SDESs) under linear temporal logic specifications. Applying bounded synthesis, we reduce the supervisor synthesis problem to that of satisfying a safety condition. First, we consider a directed controller that enables at most one controllable event at a time. We assign a negative reward to the unsafe states and introduce an expected return with a state-dependent discount factor. Using this expected return as a value function, we compute a winning region and a directed controller with the maximum satisfaction probability by dynamic programming. Next, we construct a permissive supervisor from the optimal value function and show that it achieves the maximum satisfaction probability while maximizing the reachable set within the winning region. Finally, for an unknown SDES, we propose a two-stage model-free reinforcement learning method that efficiently learns the winning region and the directed controllers with the maximum satisfaction probability. We demonstrate the effectiveness of the proposed method by simulation.
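To make the safety-as-reward reduction in the abstract concrete, the following is a minimal sketch on a hypothetical 4-state MDP abstraction of an SDES. It is not the paper's exact construction: the transition probabilities, the choice of discount factor (0 at the absorbing unsafe state, 1 elsewhere), and the convergence tolerance are all illustrative assumptions. With these choices, value iteration drives each state's value to minus the minimal probability of ever reaching the unsafe state, so the winning region, a directed controller (one enabled event per state), and a permissive event set can all be read off the converged values.

```python
import numpy as np

# Hypothetical 4-state MDP abstraction of an SDES (illustrative only).
# State 3 is the unsafe state produced by the safety reduction; 0-2 are safe.
n_states, n_events = 4, 2
P = np.zeros((n_states, n_events, n_states))  # P[s, a, s'] transition probs
P[0, 0] = [0.9, 0.1, 0.0, 0.0]   # cautious event: stays among safe states
P[0, 1] = [0.0, 0.8, 0.0, 0.2]   # risky event: may reach the unsafe state
P[1, 0] = [0.5, 0.5, 0.0, 0.0]
P[1, 1] = [0.0, 0.0, 0.9, 0.1]
P[2, :, 2] = 1.0                 # safe absorbing state
P[3, :, 3] = 1.0                 # unsafe absorbing state

# Negative reward on the unsafe state; state-dependent discount factor:
# gamma = 0 at the unsafe absorbing state, 1 elsewhere, so V(s) converges to
# minus the minimal probability of ever reaching state 3 from s.
reward = np.array([0.0, 0.0, 0.0, -1.0])
gamma = np.array([1.0, 1.0, 1.0, 0.0])

V = np.zeros(n_states)
for _ in range(500):  # dynamic-programming (value iteration) sweeps
    Q = reward[:, None] + gamma[:, None] * np.einsum("saz,z->sa", P, V)
    V = Q.max(axis=1)

max_sat_prob = 1.0 + V               # estimated maximum safety probability
winning = np.flatnonzero(V > -1e-9)  # states where safety prob 1 is achievable
directed = Q.argmax(axis=1)          # directed controller: one event per state
# Permissive supervisor: enable every event that preserves the optimal value.
permissive = [np.flatnonzero(np.isclose(Q[s], V[s])) for s in range(n_states)]
```

In this toy instance the cautious event keeps the system inside the winning region {0, 1, 2}, the directed controller picks it at states 0 and 1, and the permissive supervisor additionally enables both events at the safe absorbing state, mirroring the maximally permissive behavior described in the abstract.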