Best-response dynamics in zero-sum stochastic games
Publication type:
Article
Authors:
Leslie, David S.; Perkins, Steven; Xu, Zibo
Affiliations:
Lancaster University; Singapore University of Technology & Design
Journal:
Journal of Economic Theory
ISSN/ISBN:
0022-0531
DOI:
10.1016/j.jet.2020.105095
Publication date:
2020
Keywords:
Stochastic games
best-response dynamics
zero-sum games
convergence
Abstract:
We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaption rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else. © 2020 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).