Finite-Sample Analysis for Decentralized Batch Multiagent Reinforcement Learning With Networked Agents

Document type:

Article

Authors:

Zhang, Kaiqing; Yang, Zhuoran; Liu, Han; Zhang, Tong; Basar, Tamer

Affiliations:

University of Illinois System; University of Illinois Urbana-Champaign; Princeton University; Northwestern University; Hong Kong University of Science & Technology

Journal:

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISBN:

0018-9286

DOI:

10.1109/TAC.2021.3049345

Publication date:

2021 (issue 12)

Pages:

5925-5940

Keywords:

Games; Markov processes; Approximation algorithms; Game theory; Heuristic algorithms; Function approximation; Reinforcement learning; Machine learning; Multiagent systems; Networked control systems; Statistical learning

Abstract:

Despite the increasing interest in multiagent reinforcement learning (MARL) in multiple communities, understanding its theoretical foundation has long been recognized as a challenging problem. In this article, we address this problem by providing a finite-sample analysis for decentralized batch MARL. Specifically, we consider a type of mixed MARL setting with both cooperative and competitive agents, where two teams of agents compete in a zero-sum game setting, while the agents within each team collaborate by communicating over a time-varying network. This setting covers many conventional MARL settings in the literature. We then develop batch MARL algorithms that can be implemented in a decentralized fashion, and quantify the finite-sample errors of the estimated action-value functions. Our error analysis captures how the function class, the number of samples within each iteration, and the number of iterations determine the statistical accuracy of the proposed algorithms. Our results, compared to the finite-sample bounds for single-agent reinforcement learning, involve additional error terms caused by decentralized computation, which is inherent in our decentralized MARL setting. This article provides the first finite-sample analysis for batch MARL, a step toward rigorous theoretical understanding of general MARL algorithms in the finite-sample regime.