Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback

Publication Type:
Article
Authors:
Jordan, Michael; Lin, Tianyi; Zhou, Zhengyuan
Affiliations:
University of California System; University of California, Berkeley; Columbia University; New York University
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2022.0446
Publication Date:
2025
Keywords:
stochastic variational inequalities; algorithms; dynamics; schemes
Abstract:
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions: (1) in the single-agent setting, it achieves an optimal regret of O(log T) for strongly convex cost functions, and (2) in the multiagent setting of strongly monotone games with each agent employing OGD, we obtain last-iterate convergence of the joint action to a unique Nash equilibrium at an optimal rate of O(1/T). Whereas these finite-time guarantees highlight its merits, OGD has the drawback that it requires knowing the strong convexity/monotonicity parameters. In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters. In the single-agent setting, our algorithm achieves O(log^2(T)) regret under strong convexity, which is optimal up to a log factor. Further, if each agent employs AdaOGD in strongly monotone games, the joint action converges in a last-iterate sense to a unique Nash equilibrium at a rate of O(log^3(T)/T), again optimal up to log factors. We illustrate our algorithms in a learning version of the classic newsvendor problem, in which, because of lost sales, only (noisy) gradient feedback can be observed. Our results immediately yield the first feasible and near-optimal algorithm for both the single-retailer and multiretailer settings. We also extend our results to the more general setting of exp-concave cost functions and games, using the online Newton step algorithm.
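To make the setup concrete, below is a minimal sketch of projected online gradient descent with a decaying step size on a strongly convex loss observed only through noisy gradients, in the spirit of the single-retailer newsvendor setting described in the abstract. This is an illustrative toy, not the paper's AdaOGD: the loss function, noise model, step-size constant `c`, and feasible interval are all hypothetical choices, and a truly adaptive method would tune the step-size scale from observed gradients rather than fixing `c` in advance.

```python
import random

def noisy_grad(x, opt=2.0, noise=0.1):
    # Noisy gradient of the (hypothetical) strongly convex loss
    # f(x) = (x - opt)^2; only this noisy gradient is observable,
    # mirroring the lost-sales newsvendor feedback structure.
    return 2.0 * (x - opt) + random.gauss(0.0, noise)

def ogd_decaying(x0, T, c=1.0, lo=0.0, hi=10.0):
    # Projected OGD with step size c / t. Knowing the right constant c
    # requires the strong-convexity parameter; AdaOGD's point is to
    # adapt this schedule online without that knowledge.
    x = x0
    for t in range(1, T + 1):
        x -= (c / t) * noisy_grad(x)
        x = min(max(x, lo), hi)  # project onto the feasible interval
    return x

random.seed(0)
x_final = ogd_decaying(x0=8.0, T=5000)
```

Under strong convexity, this 1/t schedule is what yields the O(log T) regret and O(1/T) last-iterate rates quoted above; the iterate `x_final` settles near the optimizer 2.0 despite the gradient noise.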