Decentralized Learning for Optimality in Stochastic Dynamic Teams and Games With Local Control and Global State Information
Publication Type:
Article
Authors:
Yongacoglu, Bora; Arslan, Gurdal; Yuksel, Serdar
Affiliations:
Queens University - Canada; University of Hawaii System
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2021.3121228
Publication Year:
2022
Pages:
5230-5245
Keywords:
games
stochastic processes
costs
convergence
heuristic algorithms
reinforcement learning
Q-factor
cooperative control
game theory
machine learning
stochastic games
stochastic optimal control
Abstract:
Stochastic dynamic teams and games are rich models for decentralized systems and challenging testing grounds for multiagent learning. Previous work that guaranteed team optimality assumed stateless dynamics, an explicit coordination mechanism, or joint-control sharing. In this article, we present an algorithm with guarantees of convergence to team-optimal policies in teams and common-interest games. The algorithm is a two-timescale method: on the finer timescale, a variant of Q-learning performs policy evaluation, while on the coarser timescale the policy space is explored. Agents following this algorithm are independent learners: they use only local controls, local cost realizations, and global state information, without access to the controls of other agents. To the best of our knowledge, the results presented here are the first formal guarantees of convergence to team optimality using independent learners in stochastic dynamic teams and common-interest games.
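The two-timescale structure described in the abstract can be sketched from a single agent's viewpoint: an inner Q-learning loop evaluates the current baseline policy (with some experimentation), and an outer loop updates the policy toward the greedy one with inertia. This is a minimal illustrative sketch, not the paper's algorithm: the toy MDP in `step`, the phase length, learning rate, and inertia parameter are all assumptions made here for illustration.

```python
import random


def step(state, action, rng):
    """Toy 2-state, 2-action MDP (an assumption for illustration).

    Matching the current state is free; mismatching costs 1. The next
    state follows the chosen action with probability 0.9.
    """
    next_state = action if rng.random() < 0.9 else 1 - action
    cost = 0.0 if action == state else 1.0
    return next_state, cost


def two_timescale_learner(num_phases=20, phase_len=500, gamma=0.9,
                          explore=0.1, inertia=0.3, alpha=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    policy = {s: rng.choice((0, 1)) for s in (0, 1)}  # baseline policy
    state = 0
    for _ in range(num_phases):
        # Finer timescale: Q-learning policy evaluation while the
        # baseline policy is held fixed, plus occasional experimentation.
        for _ in range(phase_len):
            a = policy[state] if rng.random() > explore else rng.choice((0, 1))
            nxt, cost = step(state, a, rng)
            target = cost + gamma * min(Q[(nxt, b)] for b in (0, 1))
            Q[(state, a)] += alpha * (target - Q[(state, a)])
            state = nxt
        # Coarser timescale: switch to the greedy action, but keep the
        # old one with probability `inertia` (a stabilizing device).
        for s in (0, 1):
            best = min((0, 1), key=lambda b, s=s: Q[(s, b)])
            if policy[s] != best and rng.random() > inertia:
                policy[s] = best
    return policy, Q
```

In this toy problem the learner settles on the policy that matches the state, since that action is both cheap now and keeps future costs low.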