您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 管理科学与工程 > Mathematics of Operations Research > 2021 > 1期

Online Learning over a Finite Action Set with Limited Switching

成果类型：

Article

署名作者：

Altschuler, Jason M.; Talwar, Kunal

署名单位：

Massachusetts Institute of Technology (MIT); Alphabet Inc.; Google Incorporated

刊物名称：

MATHEMATICS OF OPERATIONS RESEARCH

ISSN/ISSBN：

0364-765X

DOI：

10.1287/moor.2020.1052

发表日期：

2021

页码：

179-203

关键词：

decision regret well

摘要：

This paper studies the value of switching actions in the Prediction From Experts problem (PFE) and Adversarial Multiarmed Bandits problem (MAB). First, we revisit the well-studied and practically motivated setting of PFE with switching costs. Many algorithms achieve the minimax optimal order for both regret and switches in expectation; however, high probability guarantees are an open problem. We present the first algorithms that achieve this optimal order for both quantities with high probability. This also implies the first high probability guarantees for several other problems, and, in particular, is efficiently adaptable to online combinatorial optimization with limited switching. Next, to investigate the value of switching actions more granularly, we introduce the switching budget setting, which limits algorithms to a fixed number of (costless) switches. Using this result and several reductions, we unify previous work and completely characterize the complexity of this switching budget setting up to small polylogarithmic factors: for both PFE and MAB, for all switching budgets, and for both expectation and high probability guarantees. Interestingly, as the switching budget decreases, the minimax regret rate admits a phase transition for PFE but not for MAB. These results recover and generalize the known minimax rates for the (arbitrary) switching cost setting.

来源URL：

访问原文