On Supervised Online Rolling-Horizon Control for Infinite-Horizon Discounted Markov Decision Processes
Publication Type:
Article
Author(s):
Chang, Hyeong Soo
Affiliation(s):
Sogang University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2023.3274791
Publication Date:
2024
Pages:
1060-1065
Keywords:
Markov processes
Heuristic algorithms
Markov decision process (MDP)
policy iteration (PI)
policy switching
rolling horizon control
Abstract:
This note revisits the rolling-horizon control approach to the problem of solving a Markov decision process (MDP) under the infinite-horizon discounted expected reward criterion. In contrast to the classical value-iteration approaches, we develop an asynchronous online algorithm based on policy iteration, integrated with policy switching, a multipolicy improvement method. The algorithm generates a sequence of monotonically improving solutions to the forecast-horizon sub-MDP by updating the current solution only at the currently visited state, in effect building a rolling-horizon control policy for the infinite-horizon MDP. Feedback from supervisors, if available, can also be incorporated during the updates. We focus on convergence in relation to the transition structure of the MDP: depending on that structure, the algorithm achieves in finite time either global convergence to an optimal forecast-horizon policy or local convergence to a locally optimal fixed policy.
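To make the abstract's description concrete, the following is a minimal sketch of an asynchronous, rolling-horizon control loop on a toy finite MDP: the current policy is improved only at the visited state (via a one-step greedy switch against its forecast-horizon value), the chosen action is executed, and the horizon rolls forward. All names, sizes, and parameters (`n_states`, `H`, `gamma`, the random MDP) are illustrative assumptions, not the paper's exact algorithm or notation.

```python
import numpy as np

# Hypothetical toy MDP (assumed for illustration only).
n_states, n_actions, H, gamma = 3, 2, 5, 0.95
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] = next-state distribution
R = rng.random((n_states, n_actions))                             # R[s, a] = immediate reward

def policy_value_H(pi: np.ndarray, H: int) -> np.ndarray:
    """H-horizon discounted value of a stationary policy pi (array of actions)."""
    v = np.zeros(n_states)
    idx = np.arange(n_states)
    for _ in range(H):
        v = R[idx, pi] + gamma * (P[idx, pi] @ v)
    return v

def improve_at_state(pi: np.ndarray, s: int) -> np.ndarray:
    """Asynchronous improvement step: update pi only at the visited state s,
    greedily with respect to the forecast-horizon value of the current policy."""
    v = policy_value_H(pi, H)
    q = R[s] + gamma * (P[s] @ v)      # one-step lookahead Q-values at s
    pi_new = pi.copy()
    pi_new[s] = int(np.argmax(q))
    return pi_new

# Online rolling-horizon control loop (sketch): improve at the current state,
# act with the updated policy, observe the next state, repeat.
pi = np.zeros(n_states, dtype=int)
s = 0
for _ in range(20):
    pi = improve_at_state(pi, s)
    s = int(rng.choice(n_states, p=P[s, pi[s]]))
```

In the paper's setting the improvement step would also merge in supervisor policies via policy switching; here only the self-improvement part is sketched.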