Adaptive Optimal Control of Unknown Nonlinear Systems via Homotopy-Based Policy Iteration
Publication type:
Article
Authors:
Chen, Ci; Lewis, Frank L.; Xie, Kan; Xie, Shengli
Affiliations:
Guangdong University of Technology; Guangdong University of Technology; Guangdong University of Technology
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2023.3339660
Publication date:
2024
Pages:
3396-3403
Keywords:
Adaptive optimal control
homotopic
initial admissible control
policy iteration (PI)
reinforcement learning (RL)
Abstract:
Policy iteration (PI) is an efficient reinforcement learning technique, but it requires an initial admissible control policy (a stabilizing one, for linear systems), which makes existing PI-based results model dependent. To attain completely data-driven adaptive optimal control, this article integrates a homotopic design with PI for unknown continuous-time nonlinear systems. Technically, we use a homotopic constant to construct an artificially stable system, so that zero control can initialize PI. Following a homotopic strategy, we recursively update the artificial system and force it to gradually recover the original system. This yields an admissible control policy in a finite number of iterations, without any model-based initialization. Once the admissible control is obtained, the proposed homotopic PI inherits the fast convergence of traditional PI and learns the optimal control solution from data measured from the unknown nonlinear system.
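The homotopy idea in the abstract can be illustrated with a minimal sketch. It is not the article's algorithm: it assumes a known linear system with quadratic cost (the article treats unknown nonlinear systems from measured data), it assumes the artificial system takes the shifted form A - alpha*I, and it uses model-based Kleinman-style PI. The names `homotopic_pi`, `pi_step`, `shrink`, and the shift-reduction rule are hypothetical choices made for illustration.

```python
import numpy as np

def lyap(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for symmetric P via Kronecker products."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)

def spectral_abscissa(M):
    """Largest real part among the eigenvalues of M (negative means Hurwitz)."""
    return float(np.max(np.linalg.eigvals(M).real))

def pi_step(A, B, Q, R, K):
    """One policy-iteration step: evaluate gain K, then improve it."""
    P = lyap(A - B @ K, Q + K.T @ R @ K)
    return np.linalg.solve(R, B.T @ P), P

def homotopic_pi(A, B, Q, R, shrink=0.5, tol=1e-10, max_iter=100):
    """Illustrative, model-based sketch (assumption: artificial system A - alpha*I)."""
    n, m = B.shape
    I = np.eye(n)
    # Pick a shift large enough that A - alpha*I is stable, so K = 0 is admissible.
    alpha = max(0.0, spectral_abscissa(A)) + 1.0
    K = np.zeros((m, n))
    steps = 0
    # Phase 1: improve the policy on the artificial system, then reduce the shift
    # while the current gain still stabilizes the less-shifted system.
    while alpha > 0.0:
        if steps >= max_iter:
            raise RuntimeError("homotopy did not reach the original system")
        K, _ = pi_step(A - alpha * I, B, Q, R, K)
        s = spectral_abscissa(A - B @ K)
        alpha = 0.0 if s < 0.0 else s + shrink * (alpha - s)
        steps += 1
    # Phase 2: K now stabilizes the original system; run standard PI to convergence.
    for _ in range(max_iter):
        K_new, P = pi_step(A, B, Q, R, K)
        if np.linalg.norm(K_new - K) < tol:
            return K_new, P, steps
        K = K_new
    raise RuntimeError("policy iteration did not converge")
```

For an unstable test system such as `A = np.diag([1.0, 0.5])` with `B = Q = R = np.eye(2)`, calling `homotopic_pi(A, B, Q, R)` starts from zero gain and returns a stabilizing gain together with a matrix `P` that can be checked against the algebraic Riccati equation residual.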