Rate-Optimal Bayesian Simple Regret in Best Arm Identification

成果类型:
Article
署名作者:
Komiyama, Junpei; Ariu, Kaito; Kato, Masahiro; Qin, Chao
署名单位:
New York University; Royal Institute of Technology; Columbia University
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
DOI:
10.1287/moor.2022.0011
发表日期:
2024
页码:
1629-1646
关键词:
multiarmed bandit convergence-rates allocation elimination algorithms DESIGN
摘要:
We consider best arm identification in the multiarmed bandit problem. Assuming certain continuity conditions of the prior, we characterize the rate of the Bayesian simple regret. Differing from Bayesian regret minimization, the leading term in the Bayesian simple regret derives from the region in which the gap between optimal and suboptimal p ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi arms is smaller than (log T)/T. We propose a simple and easy-to-compute algorithm with its leading term matching with the lower bound up to a constant factor; simulation results support our theoretical findings.