Generalized Policy Improvement Algorithms With Theoretically Supported Sample Reuse

Type:
Article
Authors:
Queeney, James; Paschalidis, Ioannis Ch.; Cassandras, Christos G.
Affiliations:
Boston University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2024.3454011
Publication Date:
2025
Pages:
1236-1243
Keywords:
Approximation algorithms; Optimization; TV; Training; Task analysis; Trajectory; Heuristic algorithms; Policy improvement; policy optimization; reinforcement learning (RL); sample reuse
Abstract:
We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a tradeoff between two important deployment requirements for real-world control: 1) practical performance guarantees; and 2) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.
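The abstract's central idea, reusing samples from several recent policies while retaining an on-policy-style improvement objective, can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's actual Generalized Policy Improvement objective: it evaluates a PPO-style clipped surrogate where each sample's log-probability comes from whichever past policy generated it, so data pooled from multiple recent policies can enter one update. The function name `gpi_surrogate` and all arrays are hypothetical.

```python
import numpy as np

def gpi_surrogate(adv, logp_new, logp_old, eps=0.2):
    """Clipped surrogate over samples pooled from several past policies.

    Illustrative sketch only (not the paper's exact objective):
    `logp_old` holds the log-probability of each action under the
    behavior policy that generated that particular sample, which is
    what allows data from the last few policies to be reused.
    """
    # Per-sample importance ratio between current and generating policy.
    ratio = np.exp(logp_new - logp_old)
    # Clipping limits how far reused (off-policy) samples can push the update.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    return float(np.mean(np.minimum(ratio * adv, clipped * adv)))

# Toy batch: four samples pooled from two earlier behavior policies.
adv = np.array([1.0, -0.5, 2.0, 0.3])            # advantage estimates
logp_old = np.array([-1.0, -0.7, -1.2, -0.9])    # per-sample generating policy
logp_new = np.array([-0.9, -0.8, -1.0, -0.9])    # current policy
print(gpi_surrogate(adv, logp_new, logp_old))
```

When `logp_new` equals `logp_old` the ratios are all one and the surrogate reduces to the mean advantage, recovering the purely on-policy case; the clipping term is one common way to keep the off-policy correction bounded when samples are reused.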