An Optimal High-Order Tensor Method for Convex Optimization
Publication Type:
Article
Authors:
Jiang, Bo; Wang, Haoyue; Zhang, Shuzhong
Affiliations:
Shanghai University of Finance & Economics; Massachusetts Institute of Technology (MIT); University of Minnesota System; University of Minnesota Twin Cities; Shenzhen Research Institute of Big Data; The Chinese University of Hong Kong, Shenzhen
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.2020.1103
Publication Date:
2021
Pages:
1390-1412
Keywords:
proximal extragradient method
1st-order methods
complexity
regularization
Abstract:
This paper is concerned with finding an optimal algorithm for minimizing a composite convex objective function. The basic setting is that the objective is the sum of two convex functions: the first function is smooth, with derivative information available up to order d, and the second function is possibly nonsmooth, but its proximal tensor mappings can be computed approximately in an efficient manner. The problem is to find, in that setting, the best possible (optimal) iteration complexity for convex optimization. Along that line, for the smooth case (without the second, nonsmooth part in the objective), Nesterov proposed an optimal algorithm for first-order methods (d = 1) with iteration complexity O(1/k^2), whereas high-order tensor algorithms (using up to general dth-order tensor information) with iteration complexity O(1/k^(d+1)) were established more recently. In this paper, we propose a new high-order tensor algorithm for the general composite case with iteration complexity O(1/k^((3d+1)/2)), which matches the previously established lower bound for dth-order methods and is hence optimal. Our approach is based on the accelerated hybrid proximal extragradient (A-HPE) framework proposed by Monteiro and Svaiter, in which a bisection procedure is installed within each A-HPE iteration. At each bisection step, a proximal tensor subproblem is solved approximately, and the total number of bisection steps per A-HPE iteration is shown to be bounded by a logarithmic factor in the required precision.
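For reference, the setting and rates stated in the abstract can be summarized in the following sketch; the symbols F, f, h, x_k, and x^* are illustrative notation introduced here and are not necessarily the paper's own.

\[
\min_{x} \; F(x) := f(x) + h(x), \qquad
\text{$f$ convex and smooth with derivatives up to order $d$, $h$ convex and possibly nonsmooth,}
\]
\[
F(x_k) - F(x^*) =
\begin{cases}
O(1/k^{2}) & \text{Nesterov's optimal first-order method } (d = 1),\\
O(1/k^{d+1}) & \text{earlier $d$th-order tensor methods},\\
O(1/k^{(3d+1)/2}) & \text{the method proposed in this paper, matching the $d$th-order lower bound}.
\end{cases}
\]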