Trust-Region Inverse Reinforcement Learning

Document type:

Article

Authors:

Cao, Kun; Xie, Lihua

Affiliation:

Nanyang Technological University

Journal:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISBN:

0018-9286

DOI:

10.1109/TAC.2023.3274629

Publication date:

2024

Pages:

1037-1044

Keywords:

Inverse reinforcement learning (IRL); Pontryagin's maximum principle (PMP); trust region methods

Abstract:

This article proposes a new unified inverse reinforcement learning framework based on trust-region methods and a recently proposed Pontryagin differential programming method in Jin et al.'s work (2020), which aims to learn the parameters in both the system model and the cost function for three types of problems, namely, N-player nonzero-sum multistage games, two-player zero-sum multistage games, and one-player optimal control, from demonstrated trajectories. Different from the existing frameworks using gradient to update learning parameters, our framework updates them with the candidate solution of trust-region subproblem, where its required gradient and Hessian are obtained by differentiating Pontryagin's maximum principle (PMP) equations once and twice, respectively. The differentiated equations are shown to be equivalent to the PMP equations for affine-quadratic games/optimal control problems and can be solved by some explicit recursions. Extensive simulation examples and comparisons are presented to demonstrate the effectiveness of our proposed algorithm.
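
Illustrative sketch (not from the article): the abstract contrasts plain gradient updates with updates taken from the candidate solution of a trust-region subproblem built from a gradient and a Hessian. The Python code below shows that general idea only, under loud assumptions. A toy quadratic loss stands in for the learning loss, its gradient and Hessian are written analytically rather than obtained by differentiating the PMP equations, and the subproblem is solved approximately with the textbook Cauchy point, which may differ from the solver used by the authors. All names (`cauchy_point`, `trust_region_minimize`, `theta_true`, `A`) are hypothetical.

```python
import numpy as np


def cauchy_point(g, H, radius):
    """Approximately minimize m(p) = g@p + 0.5*p@H@p subject to ||p|| <= radius.

    Assumes g is nonzero. The step lies along the steepest-descent direction.
    """
    g_norm = np.linalg.norm(g)
    curvature = g @ H @ g
    if curvature <= 0.0:
        tau = 1.0  # model keeps decreasing, so go to the region boundary
    else:
        tau = min(g_norm**3 / (radius * curvature), 1.0)
    return -tau * radius * g / g_norm


def trust_region_minimize(loss, grad, hess, theta0, radius=1.0,
                          max_radius=10.0, accept=0.1, tol=1e-8, max_iter=200):
    """Basic trust-region loop: propose a step, compare actual vs. predicted
    decrease, then shrink or grow the radius and accept or reject the step."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g, H = grad(theta), hess(theta)
        if np.linalg.norm(g) < tol:
            break
        p = cauchy_point(g, H, radius)
        predicted = -(g @ p + 0.5 * p @ H @ p)   # decrease promised by the model
        actual = loss(theta) - loss(theta + p)   # decrease actually achieved
        rho = actual / predicted
        if rho < 0.25:
            radius *= 0.25                        # model was poor: shrink region
        elif rho > 0.75 and np.isclose(np.linalg.norm(p), radius):
            radius = min(2.0 * radius, max_radius)  # good model, step at boundary
        if rho > accept:
            theta = theta + p                     # accept the candidate solution
    return theta


# Toy stand-in for a learning loss (assumption): a convex quadratic in theta.
theta_true = np.array([1.0, -2.0])
A = np.array([[3.0, 1.0], [1.0, 2.0]])


def loss(theta):
    d = theta - theta_true
    return 0.5 * d @ A @ d


def grad(theta):
    return A @ (theta - theta_true)


def hess(theta):
    return A


theta_hat = trust_region_minimize(loss, grad, hess, theta0=np.zeros(2))
print(theta_hat)
```

The ratio `rho` is what separates this scheme from a fixed-step gradient update: a step is taken only when the quadratic model predicted the loss decrease reasonably well, and the radius adapts to how trustworthy the model has been.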