BATCHED BANDIT PROBLEMS
Document type:
Article
Authors:
Perchet, Vianney; Rigollet, Philippe; Chassang, Sylvain; Snowberg, Erik
Affiliations:
Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI); Universite Paris Cite; Sorbonne Universite; Inria; Massachusetts Institute of Technology (MIT); Princeton University; California Institute of Technology; National Bureau of Economic Research
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/15-AOS1381
Publication date:
2016
Pages:
660-681
Keywords:
regret bounds
selecting 1
allocation
tests
model
Abstract:
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
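To make the batching constraint in the abstract concrete, the following is a minimal sketch of a batched explore-then-commit rule for a two-armed Bernoulli bandit. It is an illustration under stated assumptions, not the policy from the article: the record does not specify the policy, so the geometric batch grid, the Hoeffding-style threshold, and the names `geometric_grid` and `batched_policy` are all hypothetical choices made here. Rewards are only inspected at the end of each batch, and each exploring batch pulls one arm and then the other, so the arm changes at most twice per batch.

```python
import math
import random


def geometric_grid(horizon, num_batches):
    """Batch end points t_1 < ... < t_M = horizon (geometric spacing is an assumption)."""
    ends = []
    for k in range(1, num_batches + 1):
        t = int(round(horizon ** (k / num_batches)))
        prev = ends[-1] if ends else 0
        # Every batch holds at least two rounds so both arms can be pulled.
        ends.append(max(t, prev + 2))
    ends[-1] = horizon
    return ends


def batched_policy(means, horizon, num_batches, rng):
    """Hypothetical batched explore-then-commit for two Bernoulli arms.

    Returns the sequence of pulled arms and the pseudo-regret.
    """
    ends = geometric_grid(horizon, num_batches)
    sums, counts = [0.0, 0.0], [0, 0]
    committed = None
    pulls = []
    start = 0
    for k, end in enumerate(ends):
        size = end - start
        if committed is None and k == len(ends) - 1:
            # Last batch: no later decision can use its feedback,
            # so commit to the empirically better arm (arm 0 on ties).
            avg = [sums[a] / counts[a] if counts[a] else 0.0 for a in (0, 1)]
            committed = 0 if avg[0] >= avg[1] else 1
        if committed is not None:
            plan = [committed] * size
        else:
            # Explore: first half of the batch on arm 0, the rest on arm 1.
            plan = [0] * (size // 2) + [1] * (size - size // 2)
        # Rewards for the whole batch are revealed only once the batch ends.
        for arm in plan:
            sums[arm] += 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
        pulls.extend(plan)
        if committed is None:
            gap = sums[0] / counts[0] - sums[1] / counts[1]
            # Hoeffding-style threshold (an assumed choice, not from the article).
            radius = math.sqrt(2.0 * math.log(horizon) / min(counts))
            if abs(gap) > radius:
                committed = 0 if gap > 0 else 1
        start = end
    regret = sum(max(means) - means[arm] for arm in pulls)
    return pulls, regret


# Example run: arm 0 pays with probability 0.7, arm 1 with probability 0.4.
pulls, regret = batched_policy([0.7, 0.4], horizon=1000, num_batches=4,
                               rng=random.Random(0))
```

Splitting each exploring batch into two contiguous blocks, rather than alternating arms round by round, is a design choice made here to keep the number of arm changes small, which loosely mirrors the low-switching-cost remark in the abstract.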