Technical Note: On Adaptivity in Nonstationary Stochastic Optimization with Bandit Feedback
Publication Type:
Article
Author(s):
Wang, Yining
Affiliation(s):
University of Texas System; University of Texas Dallas
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2022.0576
Publication Date:
2025
Keywords:
Abstract:
In this paper, we study the nonstationary stochastic optimization problem with bandit feedback and dynamic regret measures. The seminal work of Besbes et al. (2015) shows that, when aggregated function changes are known a priori, a simple restarting algorithm attains the optimal dynamic regret. In this work, we design a stochastic optimization algorithm with fixed step sizes, which, combined with the multiscale sampling framework in existing research, achieves the optimal dynamic regret in nonstationary stochastic optimization without prior knowledge of the function change budget, thereby closing a question that has been open for a while. We also establish an additional result showing that any algorithm achieving good regret against stationary benchmarks with high probability can be automatically converted to an algorithm that achieves good regret against dynamic benchmarks (for problems that admit Õ(√T) regret against stationary benchmarks in fully adversarial settings, a dynamic regret of Õ(V_T^{1/3} T^{2/3}) is expected), which is potentially applicable to a wide class of bandit convex optimization and other types of bandit algorithms.
Source URL: