Technical Note-A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents

成果类型:
Article
署名作者:
Russo, Daniel
署名单位:
Columbia University
刊物名称:
OPERATIONS RESEARCH
ISSN/ISSBN:
0030-364X
DOI:
10.1287/opre.2020.1987
发表日期:
2021
页码:
273-278
关键词:
information
摘要:
This note gives a short, self-contained proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multiarmed bandit problem with discount factor.. The Gittins index of an arm is shown to equal the gamma-quantile of the posterior distribution of the arm's mean plus an error term that vanishes as gamma -> 1. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean-reward of an arm in a manner equivalent to an upper confidence bound.