PAC Reinforcement Learning Algorithm for General-Sum Markov Games
Publication Type:
Article
Authors:
Zehfroosh, Ashkan; Tanner, Herbert G.
Affiliation:
University of Delaware
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2022.3219340
Publication Date:
2023
Pages:
2821-2831
Keywords:
games
Markov processes
Picture archiving and communication systems
Nash equilibrium
Q-learning
Approximation algorithms
convergence
Markov game
multiagent system
probably approximately correct (PAC)
reinforcement learning
Abstract:
This article presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, this article extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate the algorithm's performance and robustness.
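The abstract describes extending Nash Q-learning, in which each player bootstraps its Q-value with the Nash-equilibrium value of the stage game at the next state. The following is a minimal sketch of that core update for a two-player general-sum game, not the paper's algorithm: it omits the delayed-update machinery that yields the PAC guarantee, and for simplicity it finds the stage-game equilibrium by enumerating pure strategies (assuming one exists). All function names are illustrative.

```python
import numpy as np

def pure_nash_value(Q1_stage, Q2_stage):
    """Return the payoff pair at a pure-strategy Nash equilibrium of the
    bimatrix stage game (Q1_stage, Q2_stage), found by enumeration.
    Assumption: a pure-strategy equilibrium exists (not true in general;
    mixed equilibria require an LP/LCP solver)."""
    n, m = Q1_stage.shape
    for a in range(n):
        for b in range(m):
            # (a, b) is a Nash equilibrium if neither player can gain
            # by deviating unilaterally.
            if (Q1_stage[a, b] >= Q1_stage[:, b].max()
                    and Q2_stage[a, b] >= Q2_stage[a, :].max()):
                return Q1_stage[a, b], Q2_stage[a, b]
    raise ValueError("no pure-strategy Nash equilibrium in this stage game")

def nash_q_update(Q1, Q2, s, a, b, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """One Nash-Q-style step: each player's target is its reward plus the
    discounted Nash value of the stage game at the next state.
    Q-tables are indexed as Q[state, action_player1, action_player2]."""
    v1, v2 = pure_nash_value(Q1[s_next], Q2[s_next])
    Q1[s, a, b] += alpha * (r1 + gamma * v1 - Q1[s, a, b])
    Q2[s, a, b] += alpha * (r2 + gamma * v2 - Q2[s, a, b])

# Toy usage: two states, two actions per player, zero-initialized tables.
Q1 = np.zeros((2, 2, 2))
Q2 = np.zeros((2, 2, 2))
nash_q_update(Q1, Q2, s=0, a=0, b=1, r1=1.0, r2=-1.0, s_next=1)
```

The PAC construction in the article replaces the single-step stochastic update above with delayed, batched updates (in the spirit of delayed Q-learning), which is what makes the sample-complexity analysis go through.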