Learning in games with continuous action sets and unknown payoff functions

Publication type:
Article
Authors:
Mertikopoulos, Panayotis; Zhou, Zhengyuan
Affiliations:
Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Inria; Stanford University
Journal:
MATHEMATICAL PROGRAMMING
ISSN/ISBN:
0025-5610
DOI:
10.1007/s10107-018-1254-8
Publication date:
2019
Pages:
465-507
Keywords:
dynamics; inequalities; convergence; equilibrium; flows
Abstract:
This paper examines the convergence of no-regret learning in games with continuous action sets. For concreteness, we focus on learning via dual averaging, a widely used class of no-regret learning schemes where players take small steps along their individual payoff gradients and then mirror the output back to their action sets. In terms of feedback, we assume that players can only estimate their payoff gradients up to a zero-mean error with bounded variance. To study the convergence of the induced sequence of play, we introduce the notion of variational stability, and we show that stable equilibria are locally attracting with high probability whereas globally stable equilibria are globally attracting with probability 1. We also discuss some applications to mixed-strategy learning in finite games, and we provide explicit estimates of the method's convergence speed.
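The abstract describes dual averaging: players accumulate (noisy) payoff gradients in a dual variable and then "mirror" the aggregate back to the action set. Below is a minimal illustrative sketch of that scheme, not the paper's own code: it assumes a single player with a concave payoff u(x) = -(x - 0.5)^2 over X = [0, 1], a Euclidean mirror map (projection), and a zero-mean, bounded-variance gradient error, all chosen here purely for illustration.

```python
import random

def dual_averaging(grad_oracle, mirror, steps, gamma):
    """Dual averaging: accumulate (noisy) payoff gradients in the dual
    variable, then mirror the aggregate back to the action set."""
    y = 0.0           # dual accumulator of scaled gradient estimates
    x = mirror(y)     # current action (primal iterate)
    for _ in range(steps):
        y += gamma * grad_oracle(x)  # small step along the payoff gradient
        x = mirror(y)                # mirror the output back to the set
    return x

random.seed(0)

def noisy_grad(x):
    # Exact gradient of u(x) = -(x - 0.5)^2 is -2(x - 0.5); we add a
    # zero-mean Gaussian error with bounded variance, as assumed above.
    return -2.0 * (x - 0.5) + random.gauss(0.0, 0.1)

def euclidean_mirror(y):
    # Euclidean mirror map on [0, 1]: project the dual aggregate.
    return min(1.0, max(0.0, y))

x_final = dual_averaging(noisy_grad, euclidean_mirror, steps=5000, gamma=0.01)
print(x_final)
```

In this toy setting the unique (variationally stable) equilibrium is x* = 0.5, and the iterates concentrate near it despite the gradient noise, illustrating the kind of convergence-with-high-probability result the paper establishes. With an entropic regularizer on the simplex, the same template recovers exponential-weights dynamics for mixed-strategy learning in finite games.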