Convergence and Sample Complexity of Policy Gradient Methods for Stabilizing Linear Systems
Document Type:
Article
Authors:
Zhao, Feiran; Fu, Xingyun; You, Keyou
Affiliation:
Tsinghua University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2024.3455508
Publication Date:
2025
Pages:
1455-1466
Keywords:
Complexity theory
Costs
Convergence
Linear systems
Trajectory
Vectors
Search problems
Policy gradient (PG)
Sample complexity
Stabilization of linear systems
Discounted linear quadratic regulator (LQR)
Abstract:
System stabilization via policy gradient (PG) methods has drawn increasing attention in both the control and machine learning communities. In this article, we study their convergence and sample complexity for stabilizing linear time-invariant systems, measured in terms of the number of system rollouts. Our analysis is built upon a discounted linear quadratic regulator (LQR) method that alternately updates the policy and the discount factor of the LQR problem. First, we propose an explicit rule to adaptively adjust the discount factor by exploiting the stability margin of a linear control policy. Then, we establish the sample complexity of PG methods for stabilization, which only adds a coefficient logarithmic in the spectral radius of the state matrix to the complexity of solving the LQR problem with a prior stabilizing policy. Finally, we perform simulations to validate our theoretical findings and demonstrate the effectiveness of our method on a class of nonlinear systems.
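The abstract describes an alternating scheme: run policy gradient on a gamma-discounted LQR problem, enlarge gamma by exploiting the stability margin of the current gain, and repeat until gamma reaches 1, at which point the policy stabilizes the original system. Below is a minimal model-based Python sketch of this idea. It is not the authors' algorithm: it uses exact gradients instead of the paper's derivative-free rollout estimator, the discount-update rule and its constants are illustrative assumptions, and the system matrices are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def spectral_radius(M):
    return np.max(np.abs(np.linalg.eigvals(M)))

def pg_discounted_lqr(A, B, Q, R, gamma, K, steps=2000, lr=1e-3):
    """Exact policy gradient descent on the gamma-discounted LQR cost.
    Discounting is folded into the dynamics via a sqrt(gamma) scaling."""
    As, Bs = np.sqrt(gamma) * A, np.sqrt(gamma) * B
    for _ in range(steps):
        Acl = As - Bs @ K
        # Value matrix P_K of the closed loop (discrete Lyapunov equation).
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # State covariance under an identity initial-state covariance.
        Sigma = solve_discrete_lyapunov(Acl, np.eye(A.shape[0]))
        grad = 2 * ((R + Bs.T @ P @ Bs) @ K - Bs.T @ P @ As) @ Sigma
        K = K - lr * grad
    return K

def stabilize_via_pg(A, B, Q, R):
    """Alternate PG policy updates with increases of the discount factor
    until gamma reaches 1, so the returned gain stabilizes the true system."""
    n, m = B.shape
    K = np.zeros((m, n))
    # Pick gamma small enough that K = 0 stabilizes the discounted system.
    gamma = min(1.0, 0.5 / spectral_radius(A) ** 2)
    while True:
        K = pg_discounted_lqr(A, B, Q, R, gamma, K)
        if gamma >= 1.0:
            return K
        # Illustrative rule: grow gamma using the stability margin of K.
        # The 0.9 and 1.05 factors are assumptions, not the paper's rule.
        margin = 1.0 / spectral_radius(A - B @ K) ** 2
        gamma = min(1.0, max(1.05 * gamma, 0.9 * margin))

if __name__ == "__main__":
    A = np.array([[1.2, 0.5], [0.0, 1.1]])  # open-loop unstable: rho(A) = 1.2
    B = np.eye(2)
    K = stabilize_via_pg(A, B, np.eye(2), np.eye(2))
    print(spectral_radius(A - B @ K))  # < 1 once a stabilizing gain is found
```

The key design point mirrored here is that a gamma-discounted LQR is equivalent to an undiscounted one with dynamics scaled by sqrt(gamma), so any gain whose closed-loop spectral radius is below 1/sqrt(gamma) is a valid stabilizing initialization for the next, less-discounted subproblem.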