Discounted multiarmed bandit problems on a collection of machines with varying speeds

成果类型:
Article
署名作者:
Dunn, RT; Glazebrook, KD
署名单位:
University of Edinburgh
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
DOI:
10.1287/moor.1030.0068
发表日期:
2004
页码:
266-279
关键词:
parallel machines scheduling jobs optimality policies flowtime systems
摘要:
This paper is the first to consider general multiarmed bandit problems on parallel machines working at different speeds. Block allocation policies make a once-for-all allocation of bandits to machines at time zero. In this class we describe how to achieve Blackwell optimality under given conditions. The block allocation policy identified allocates the bandits with the largest guaranteed reward rates to the machines operating at greatest speed. This policy is shown to be average-reward optimal in the class of general (nonanticipative, nonidling) policies.
来源URL: