Proximal-Like Incremental Aggregated Gradient Method with Linear Convergence Under Bregman Distance Growth Conditions

成果类型:
Article
署名作者:
Zhang, Hui; Dai, Yu-Hong; Guo, Lei; Peng, Wei
署名单位:
National University of Defense Technology - China; Chinese Academy of Sciences; Academy of Mathematics & System Sciences, CAS; East China University of Science & Technology
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
DOI:
10.1287/moor.2019.1047
发表日期:
2021
页码:
61-81
关键词:
1st-order methods monotone-operators descent optimization continuity convexity algorithm SUM
摘要:
We introduce a unified algorithmic framework, called the proximal-like incremental aggregated gradient (PLIAG) method, for minimizing the sum of a convex function that consists of additive relatively smooth convex components and a proper lower semicontinuous convex regularization function over an abstract feasible set whose geometry can be captured by using the domain of a Legendre function. The PLIAG method includes many existing algorithms in the literature as special cases, such as the proximal gradient method, the Bregman proximal gradient method (also called the NoLips algorithm), the incremental aggregated gradient method, the incremental aggregated proximal method, and the proximal incremental aggregated gradient method. It also includes some novel interesting iteration schemes. First, we show that the PLIAG method is globally sublinearly convergent without requiring a growth condition, which extends the sublinear convergence result for the proximal gradient algorithm to incremental aggregated-type first-order methods. Then, by embedding a so-called Bregman distance growth condition into a descent-type lemma to construct a special Lyapunov function, we show that the PLIAG method is globally linearly convergent in terms of both function values and Bregman distances to the optimal solution set, provided that the step size is not greater than some positive constant. The convergence results derived in this paper are all established beyond the standard assumptions in the literature (i.e., without requiring the strong convexity and the Lipschitz gradient continuity of the smooth part of the objective). When specialized to many existing algorithms, our results recover or supplement their convergence results under strictly weaker conditions.
来源URL: