Risk preferences of learning algorithms

Publication type:
Article
Authors:
Haupt, Andreas; Narayanan, Aroon
Affiliation:
Massachusetts Institute of Technology (MIT)
Journal:
GAMES AND ECONOMIC BEHAVIOR
ISSN/ISBN:
0899-8256
DOI:
10.1016/j.geb.2024.09.013
Publication date:
2024
Pages:
415-426
Keywords:
Online learning; Behavior attribution; Fairness
Abstract:
Many economic decision-makers today rely on learning algorithms for important decisions. This paper shows that a widely used learning algorithm, epsilon-Greedy, exhibits emergent risk aversion, favoring actions with lower payoff variance. When presented with actions of the same expected payoff, under a wide range of conditions, epsilon-Greedy chooses the lower-variance action with probability approaching one. This emergent preference can have wide-ranging consequences, from inequity to homogenization, and holds transiently even when the higher-variance action has a strictly higher expected payoff. We discuss two methods to restore risk neutrality. The first method reweights data as a function of how likely an action is to be chosen. The second method employs optimistic payoff estimates for actions that have not been taken often.
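
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the two-arm setup, the payoff distributions, the constant exploration rate, the uniform tie-breaking, and the names safe_arm_share and average_safe_share are all assumptions chosen for illustration, and the paper's own conditions may differ. Both arms have expected payoff 0, so a risk-neutral chooser would pick the safe arm about half the time.

```python
import random


def safe_arm_share(n_rounds, epsilon, rng):
    """Run epsilon-Greedy on two arms that share the same expected payoff (0).

    Arm 0 is safe (always pays 0); arm 1 is risky (pays +1 or -1 with equal odds).
    Returns the share of rounds in which the safe arm was chosen.
    All modelling choices here are illustrative assumptions.
    """
    def payoff(arm):
        return 0.0 if arm == 0 else rng.choice((-1.0, 1.0))

    # Initialise the sample-mean estimates with one pull of each arm.
    counts = [1, 1]
    sums = [payoff(0), payoff(1)]
    safe_pulls = 0

    for _ in range(n_rounds):
        mean_safe = sums[0] / counts[0]
        mean_risky = sums[1] / counts[1]
        if rng.random() < epsilon or mean_safe == mean_risky:
            arm = rng.randrange(2)  # explore, or break a tie uniformly
        else:
            arm = 0 if mean_safe > mean_risky else 1  # exploit the higher estimate
        sums[arm] += payoff(arm)
        counts[arm] += 1
        safe_pulls += arm == 0

    return safe_pulls / n_rounds


def average_safe_share(n_runs=200, n_rounds=2000, epsilon=0.1, seed=0):
    """Average the safe-arm share over independent runs drawn from one seeded RNG."""
    rng = random.Random(seed)
    total = sum(safe_arm_share(n_rounds, epsilon, rng) for _ in range(n_runs))
    return total / n_runs


if __name__ == "__main__":
    print(f"average share of safe-arm choices: {average_safe_share():.3f}")
```

The intuition this sketch tries to capture: when the risky arm's estimate falls below the safe arm's, the risky arm is sampled only during exploration, so a pessimistic estimate is corrected slowly, whereas an optimistic estimate is corrected quickly because the arm is then played often. The two remedies named in the abstract (reweighting data by choice probability, and optimistic estimates for rarely taken actions) are not implemented here.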