Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

Publication type:
Article
Authors:
Jing, Gangshan; Bai, He; George, Jemin; Chakrabortty, Aranya; Sharma, Piyush K.
Affiliations:
Chongqing University; Oklahoma State University System; Oklahoma State University - Stillwater; North Carolina State University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2024.3386061
Publication date:
2024
Pages:
7524-7539
Keywords:
Costs; Convergence; Cost function; Estimation; Linear programming; Distance learning; Computer aided instruction; Distributed learning; linear quadratic regulator; multiagent systems (MASs); reinforcement learning (RL); zeroth-order optimization (ZO)
Abstract:
Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale networks. In this article, we propose a novel distributed zeroth-order algorithm that leverages the network structure inherent in the optimization objective, allowing each agent to estimate its local gradient independently from local cost evaluations, without the use of any consensus protocol. The proposed algorithm exhibits an asynchronous update scheme and, being based on the block coordinate descent method, applies to stochastic nonconvex optimization with a possibly nonconvex feasible domain. The algorithm is then employed as a distributed model-free RL algorithm for distributed linear quadratic regulator design. We provide an empirical validation of the proposed algorithm to benchmark its convergence rate and variance against a centralized ZOO algorithm.
Source URL:
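
The abstract describes per-agent zeroth-order gradient estimation combined with asynchronous block coordinate descent. The following is a minimal illustrative sketch of that idea, not the authors' implementation: each agent owns one block of the decision variable, estimates the gradient of a local cost with respect to its own block via a two-point zeroth-order estimator, and updates that block without any consensus step. The quadratic local cost, dimensions, and step sizes below are hypothetical placeholders rather than values from the paper.

```python
# Sketch of asynchronous zeroth-order block coordinate descent (assumed setup).
import numpy as np

rng = np.random.default_rng(0)

n_agents, block_dim = 4, 3
step_size, smoothing = 1e-2, 1e-1   # step size and ZO smoothing radius (assumed)

# Decision variable split into per-agent blocks: x = (x_1, ..., x_N).
x = [rng.standard_normal(block_dim) for _ in range(n_agents)]

def local_cost(i, x_i):
    """Hypothetical local cost of agent i; in the LQR setting this would be
    the portion of the quadratic regulation cost observable by agent i."""
    target = np.full(block_dim, float(i))
    return float(np.sum((x_i - target) ** 2))

def zo_block_gradient(i, x_i):
    """Two-point zeroth-order estimate of the gradient of agent i's local
    cost with respect to its own block only (no global cost evaluation)."""
    u = rng.standard_normal(block_dim)
    u /= np.linalg.norm(u)
    forward = local_cost(i, x_i + smoothing * u)
    backward = local_cost(i, x_i - smoothing * u)
    return block_dim * (forward - backward) / (2.0 * smoothing) * u

# Asynchronous-style updates: at each iteration one randomly chosen agent
# updates its own block; no consensus protocol is needed.
for _ in range(2000):
    i = rng.integers(n_agents)
    x[i] = x[i] - step_size * zo_block_gradient(i, x[i])

print([np.round(xi, 2) for xi in x])
```

In this toy setting each block converges toward the minimizer of its own local cost; in the paper's distributed LQR formulation the blocks correspond to local feedback gains and the local costs are obtained from observed trajectories.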