Bandit problems with infinitely many arms
成果类型:
Article
署名作者:
Berry, DA; Chen, RW; Zame, A; Heath, DC; Shepp, LA
署名单位:
Duke University; University of Miami; Cornell University; AT&T
刊物名称:
ANNALS OF STATISTICS
ISSN/ISSBN:
0090-5364
DOI:
10.1214/aos/1069362389
发表日期:
1997
页码:
2103-2116
关键词:
job
摘要:
We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n --> infinity. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between root 2/root n and 2/root n and we exhibit classes of strategies that achieve the latter.