Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback

Publication Type:
Article
Authors:
Jordan, Michael; Lin, Tianyi; Zhou, Zhengyuan
Affiliations:
University of California System; University of California, Berkeley; Columbia University; New York University
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2022.0446
Publication Date:
2025
Keywords:
stochastic variational inequalities; algorithms; dynamics; schemes
Abstract:
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions: (1) in the single-agent setting, it achieves an optimal regret of O(log T) for strongly convex cost functions, and (2) in the multiagent setting of strongly monotone games with each agent employing OGD, we obtain last-iterate convergence of the joint action to a unique Nash equilibrium at an optimal rate of O(1/T). Whereas these finite-time guarantees highlight its merits, OGD has the drawback that it requires knowing the strong convexity/monotonicity parameters. In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters. In the single-agent setting, our algorithm achieves O(log^2(T)) regret under strong convexity, which is optimal up to a log factor. Further, if each agent employs AdaOGD in strongly monotone games, the joint action converges in a last-iterate sense to a unique Nash equilibrium at a rate of O(log^3(T)/T), again optimal up to log factors. We illustrate our algorithms in a learning version of the classic newsvendor problem, in which, because of lost sales, only (noisy) gradient feedback can be observed. Our results immediately yield the first feasible and near-optimal algorithm for both the single-retailer and multiretailer settings. We also extend our results to the more general setting of exp-concave cost functions and games, using the online Newton step algorithm.
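To make the setup concrete, below is a minimal sketch of projected online gradient descent with a decaying step size on a strongly convex loss observed only through noisy gradients, in the spirit of the single-retailer newsvendor setting described in the abstract. This is an illustrative toy, not the paper's AdaOGD: the loss function, noise model, step-size constant `c`, and feasible interval are all hypothetical choices, and a truly adaptive method would tune the step-size scale from observed gradients rather than fixing `c` in advance.

```python
import random

def noisy_grad(x, opt=2.0, noise=0.1):
    # Noisy gradient of the (hypothetical) strongly convex loss
    # f(x) = (x - opt)^2; only this noisy gradient is observable,
    # mirroring the lost-sales newsvendor feedback structure.
    return 2.0 * (x - opt) + random.gauss(0.0, noise)

def ogd_decaying(x0, T, c=1.0, lo=0.0, hi=10.0):
    # Projected OGD with step size c / t. Knowing the right constant c
    # requires the strong-convexity parameter; AdaOGD's point is to
    # adapt this schedule online without that knowledge.
    x = x0
    for t in range(1, T + 1):
        x -= (c / t) * noisy_grad(x)
        x = min(max(x, lo), hi)  # project onto the feasible interval
    return x

random.seed(0)
x_final = ogd_decaying(x0=8.0, T=5000)
```

Under strong convexity, this 1/t schedule is what yields the O(log T) regret and O(1/T) last-iterate rates quoted above; the iterate `x_final` settles near the optimizer 2.0 despite the gradient noise.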