Efficient Off-Policy Q-Learning for Data-Based Discrete-Time LQR Problems

Publication type:

Article

Authors:

Lopez, Victor G.; Alsalti, Mohammad; Mueller, Matthias A.

Affiliation:

Leibniz University Hannover

Journal:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISBN:

0018-9286

DOI:

10.1109/TAC.2023.3235967

Publication date:

2023

Pages:

2922-2933

Keywords:

Q-learning; heuristic algorithms; data models; convergence; trajectory; prediction algorithms; linear systems; data-based control; optimal control; reinforcement learning (RL)

Abstract:

This article introduces and analyzes an improved Q-learning algorithm for discrete-time linear time-invariant systems. The proposed method does not require any knowledge of the system dynamics, and it enjoys significant efficiency advantages over other data-based optimal control methods in the literature. The algorithm can be executed fully offline because, unlike on-policy algorithms, it does not require applying the current estimate of the optimal input to the system. It is shown that a persistently exciting (PE) input, defined from an easily tested matrix rank condition, guarantees the convergence of the algorithm. A data-based method is proposed to design the initial stabilizing feedback gain that the algorithm requires. Robustness of the algorithm in the presence of noisy measurements is analyzed. The proposed algorithm is compared in simulation with several direct and indirect data-based control design methods.
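
To make the idea of off-policy Q-learning for the LQR problem concrete, the following is a minimal sketch of a generic least-squares policy iteration on a quadratic Q-function. It is not the algorithm proposed in the article, whose details are not given in this record. The system matrices `A` and `B`, the weights `Q` and `R`, the sample count, and the function names are all hypothetical choices for illustration. The matrices `A` and `B` are used only to simulate data, never by the learner. The rank test inside `evaluate_policy` is a condition specific to this sketch and is not claimed to be the rank condition of the article. The plant is assumed open-loop stable so that the zero gain can serve as the initial stabilizing gain.

```python
import numpy as np

def collect_data(A, B, n_steps, rng):
    """Simulate one trajectory driven by random inputs (the behavior policy)."""
    n, m = B.shape
    X = np.zeros((n_steps + 1, n))
    U = rng.standard_normal((n_steps, m))
    X[0] = rng.standard_normal(n)
    for k in range(n_steps):
        X[k + 1] = A @ X[k] + B @ U[k]
    return X, U

def evaluate_policy(X, U, K, Q, R):
    """Least-squares estimate of H in Q_K(x, u) = [x; u]^T H [x; u].

    Uses the Bellman equation
        z_k^T H z_k - z'_{k+1}^T H z'_{k+1} = x_k^T Q x_k + u_k^T R u_k,
    with z_k = [x_k; u_k] and z'_{k+1} = [x_{k+1}; -K x_{k+1}].
    The same stored data is reused for every gain K (off-policy).
    """
    Z = np.hstack([X[:-1], U])
    Zn = np.hstack([X[1:], -X[1:] @ K.T])
    N, p = Z.shape
    # Row k holds vec(z z^T) - vec(z' z'^T), so that Phi @ vec(H) = stage cost.
    Phi = (np.einsum('ki,kj->kij', Z, Z).reshape(N, -1)
           - np.einsum('ki,kj->kij', Zn, Zn).reshape(N, -1))
    cost = (np.einsum('ki,ij,kj->k', X[:-1], Q, X[:-1])
            + np.einsum('ki,ij,kj->k', U, R, U))
    # Richness check for this sketch: the rows must span the symmetric matrices.
    if np.linalg.matrix_rank(Phi) < p * (p + 1) // 2:
        raise ValueError("data not rich enough to determine H")
    h, *_ = np.linalg.lstsq(Phi, cost, rcond=None)
    H = h.reshape(p, p)
    return 0.5 * (H + H.T)

def improve_policy(H, n):
    """Greedy gain for u = -K x: K = H_uu^{-1} H_ux."""
    return np.linalg.solve(H[n:, n:], H[n:, :n])

# Hypothetical example system (assumption: open-loop stable, so K = 0 stabilizes).
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.eye(1)

rng = np.random.default_rng(0)
X, U = collect_data(A, B, n_steps=60, rng=rng)

K_learned = np.zeros((1, 2))          # initial stabilizing gain
for _ in range(15):                   # policy iteration on the stored data only
    H = evaluate_policy(X, U, K_learned, Q, R)
    K_learned = improve_policy(H, n=2)
```

With noise-free data, each iteration of this sketch reproduces a model-based policy iteration step, so `K_learned` can be checked against the gain obtained from a Riccati recursion on the true `A` and `B`. The data is collected once with random inputs and never with the learned gain, which is what makes the scheme off-policy and executable offline.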