Optimism-Based Adaptive Regulation of Linear-Quadratic Systems

Publication type:

Article

Authors:

Faradonbeh, Mohamad Kazem Shirani; Tewari, Ambuj; Michailidis, George

Affiliations:

State University System of Florida; University of Florida; University of Michigan System; University of Michigan

Journal:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISBN:

0018-9286

DOI:

10.1109/TAC.2020.2998952

Publication date:

2021

Pages:

1802-1808

Keywords:

Regulation; Adaptive systems; Upper bound; Estimation; Probabilistic logic; Uncertainty; Eigenvalues and eigenfunctions; Certainty equivalence (CE); Exploration-exploitation; Optimism in the face of uncertainty (OFU); Reinforcement learning; Regret bounds

Abstract:

The main challenge for adaptive regulation of linear-quadratic systems is the tradeoff between identification and control. An adaptive policy needs to address both the estimation of unknown dynamics parameters (exploration), as well as the regulation of the underlying system (exploitation). To this end, optimism-based methods that bias the identification in favor of optimistic approximations of the true parameter are employed in the literature. A number of asymptotic results have been established, but their finite-time counterparts are few, with important restrictions. This article establishes results for the worst-case regret of optimism-based adaptive policies. The presented high probability upper bounds are optimal up to logarithmic factors. The nonasymptotic analysis of this article requires the following very mild assumptions: stabilizability of the system's dynamics, and limiting the degree of heaviness of the noise distribution. To establish such bounds, certain novel techniques are developed to comprehensively address the probabilistic behavior of dependent random matrices with heavy-tailed distributions.