Decentralized Nash Equilibria Learning for Online Game With Bandit Feedback
Document type:
Article
Authors:
Meng, Min; Li, Xiuxian; Chen, Jie
Affiliations:
Tongji University; Tongji University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2023.3342850
Publication date:
2024
Pages:
4050-4057
Keywords:
Distributed online learning
generalized Nash equilibrium
mirror descent
one-point delayed bandit feedback
online game
Abstract:
This article studies distributed online bandit learning of generalized Nash equilibria for online games, where the cost functions of all players and coupled constraints are time-varying. The function values, rather than full information about cost and local constraint functions, are revealed to local players with time delays. The goal of each player is to selfishly minimize its own cost function with no future information, subject to a strategy set constraint and time-varying coupled inequality constraints. To this end, a distributed online algorithm based on mirror descent and one-point delayed bandit feedback is designed for seeking generalized Nash equilibria in the online game. It is shown that the devised online algorithm achieves sublinear expected regrets and accumulated constraint violation if the path variation of the generalized Nash equilibrium sequence is sublinear. Simulations are presented to illustrate the efficiency of the theoretical result.
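To make the two ingredients named in the abstract concrete, the following is a minimal single-player sketch, not the article's algorithm. It assumes a Euclidean mirror map (so the mirror descent step reduces to a projected gradient step), a ball-shaped strategy set, and no feedback delay, coupled constraints, or communication between players. All function names and parameters (`run_player`, `step`, `delta`, `radius`) are hypothetical and chosen only for illustration.

```python
import math
import random


def unit_sphere_sample(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def one_point_gradient(cost_value, direction, delta):
    """One-point estimate (dim / delta) * f(x + delta * u) * u."""
    dim = len(direction)
    scale = dim * cost_value / delta
    return [scale * c for c in direction]


def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [radius * c / norm for c in x]


def run_player(cost, dim, rounds, step, delta, radius, seed=0):
    """Assumed setup: one player, ball strategy set, undelayed feedback."""
    rng = random.Random(seed)
    x = [0.0] * dim
    for t in range(rounds):
        u = unit_sphere_sample(dim, rng)
        # Play a perturbed action and observe only the cost value there.
        played = [xi + delta * ui for xi, ui in zip(x, u)]
        value = cost(t, played)
        g = one_point_gradient(value, u, delta)
        # With a Euclidean mirror map, mirror descent is a projected step.
        # Projecting onto a shrunk ball keeps the perturbed action feasible.
        moved = [xi - step * gi for xi, gi in zip(x, g)]
        x = project_ball(moved, radius - delta)
    return x
```

A call such as `run_player(lambda t, a: sum((ai - 0.5) ** 2 for ai in a), dim=2, rounds=200, step=0.01, delta=0.1, radius=1.0)` runs the loop on a fixed quadratic cost. In the setting the abstract describes, the cost would change with `t`, the observed value would arrive with a delay, and each player would also handle the time-varying coupled inequality constraints.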