A Generalized Stacked Reinforcement Learning Method for Sampled Systems

Publication type:
Article
Authors:
Osinenko, Pavel; Dobriborsci, Dmitrii; Yaremenko, Grigory; Malaniya, Georgiy
Affiliation:
Skolkovo Institute of Science & Technology
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2023.3250032
Publication date:
2023
Pages:
7006-7013
Keywords:
Mobile robot; model predictive control (MPC); optimal control; Q-learning; reinforcement learning (RL)
Abstract:
A common setting of reinforcement learning (RL) is a Markov decision process (MDP) in which the environment is a stochastic discrete-time dynamical system. Whereas MDPs are suitable in such applications as video games or puzzles, physical systems are continuous in time. A general variant of RL is digital, in which updates of the value (or cost) function and the policy are performed at discrete moments in time. The agent-environment loop then amounts to a sampled system, of which sample-and-hold is a special case. In this article, we propose and benchmark two RL methods suitable for sampled systems. Specifically, we hybridize model predictive control with critics that learn the optimal Q-function and value (cost-to-go) function. Optimality is analyzed, and performance is compared in an experimental case study with a mobile robot.
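The abstract's core idea (MPC whose finite horizon is capped by a learned critic supplying the cost-to-go) can be illustrated with a toy sketch. Everything here is an illustrative assumption, not the paper's implementation: a scalar sample-and-hold system x_{k+1} = x_k + dt*u_k, a quadratic critic V_hat(x) = w*x^2, and an exhaustive search over a coarse action grid.

```python
# Hedged sketch: MPC with a learned terminal value function (critic),
# on an assumed toy sample-and-hold scalar system. All names, the
# quadratic critic form, and the action grid are illustrative.
import itertools

DT = 0.1                     # sampling period (sample-and-hold)
HORIZON = 3                  # MPC prediction horizon (in samples)
ACTIONS = [-1.0, 0.0, 1.0]   # coarse control grid for exhaustive search

def stage_cost(x, u):
    """Running cost accumulated over one sampling interval."""
    return DT * (x * x + 0.1 * u * u)

def critic(x, w=1.0):
    """Assumed critic: quadratic value-function approximation V_hat(x) = w*x^2."""
    return w * x * x

def mpc_with_critic(x0):
    """Return the first action of the horizon-optimal sequence, where the
    cost beyond the horizon is summarized by the critic's estimate."""
    best_u, best_cost = None, float("inf")
    for seq in itertools.product(ACTIONS, repeat=HORIZON):
        x, cost = x0, 0.0
        for u in seq:
            cost += stage_cost(x, u)
            x = x + DT * u          # sample-and-hold Euler step
        cost += critic(x)           # critic supplies the terminal cost
        if cost < best_cost:
            best_u, best_cost = seq[0], cost
    return best_u

# Closed loop: the controller should drive the state toward the origin.
x = 1.0
for _ in range(30):
    x += DT * mpc_with_critic(x)
```

In the paper's setting the critic weights would be updated online from observed costs (Q-learning-style), whereas here the critic is frozen for brevity.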