Reinforcement learning in a prisoner's dilemma
Document type:
Article
Author:
Dolgopolov, Arthur
Affiliation:
University of Bielefeld
Journal:
GAMES AND ECONOMIC BEHAVIOR
ISSN/ISBN:
0899-8256
DOI:
10.1016/j.geb.2024.01.004
Publication date:
2024
Pages:
84-103
Keywords:
Q-learning
stochastic stability
evolutionary game theory
collusion
pricing algorithms
Abstract:
I characterize the outcomes of a class of model-free reinforcement learning algorithms, such as stateless Q-learning, in a prisoner's dilemma. The behavior is studied in the limit as players stop experimenting after sufficiently exploring their options. A closed-form relationship between the learning rate and game payoffs reveals whether the players will learn to cooperate or defect. The findings have implications for algorithmic collusion and also apply to asymmetric learners with different experimentation rules.
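For orientation, the setting the abstract describes can be sketched as two stateless Q-learners playing a repeated prisoner's dilemma while their exploration rate decays toward zero. This is a minimal illustration and not the paper's model: the payoff values, the epsilon-greedy rule, the geometric decay schedule, the omission of any continuation (discount) term, and all function and parameter names are assumptions made here for the example.

```python
import random

# Hypothetical prisoner's dilemma payoffs (not taken from the paper),
# ordered so that temptation > reward > punishment > sucker.
PAYOFFS = {
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 5.0),
    ("D", "C"): (5.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}
ACTIONS = ("C", "D")


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over a stateless Q-table (one value per action)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[a])


def simulate(alpha=0.1, rounds=20000, eps0=1.0, decay=0.9995, seed=0):
    """Run two independent learners; return their final Q-tables.

    alpha is the learning rate. Exploration shrinks geometrically, an
    assumed schedule standing in for "players stop experimenting".
    """
    rng = random.Random(seed)
    q1 = {a: 0.0 for a in ACTIONS}
    q2 = {a: 0.0 for a in ACTIONS}
    epsilon = eps0
    for _ in range(rounds):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        r1, r2 = PAYOFFS[(a1, a2)]
        # Stateless update: move the played action's value toward the
        # realized payoff; no next-state term is used in this sketch.
        q1[a1] += alpha * (r1 - q1[a1])
        q2[a2] += alpha * (r2 - q2[a2])
        epsilon *= decay
    return q1, q2


if __name__ == "__main__":
    q1, q2 = simulate()
    print("player 1:", q1)
    print("player 2:", q2)
```

Varying `alpha` and the payoff table in this sketch is one way to explore informally how the learning rate and payoffs interact; the closed-form condition itself is given in the paper and is not reproduced here.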
Source URL: