Tighter Analysis for Decentralized Stochastic Gradient Method: Impact of Data Homogeneity
Publication Type:
Article
Authors:
Li, Qiang; Wai, Hoi-To
Affiliation:
Chinese University of Hong Kong
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2025.3545551
Publication Date:
2025
Pages:
4703-4718
Keywords:
Transient analysis
Convergence
Optimization
Stochastic processes
Linear programming
Machine learning algorithms
Approximation algorithms
Gradient methods
Distributed databases
Vectors
Convex optimization
Decentralized stochastic gradient descent (DSGD)
Distributed optimization
Nonconvex optimization
Transient time
Abstract:
This article studies the effect of data homogeneity on multiagent stochastic optimization. We consider the decentralized stochastic gradient descent (DSGD) algorithm and perform a refined convergence analysis. Our analysis is explicit in the similarity between the Hessian matrices of the local objective functions, which captures the degree of data homogeneity. We illustrate the impact of our analysis by studying the transient time, defined as the minimum number of iterations required for a distributed algorithm to achieve performance comparable to its centralized counterpart. When the local objective functions have similar Hessians, the transient time of DSGD can be as small as O(n^{2/3}/ρ^{8/3}) for smooth (possibly nonconvex) objective functions and O(√n/ρ) for strongly convex objective functions, where n is the number of agents and ρ is the spectral gap of the graph. These findings provide a theoretical justification for the empirical success of DSGD. Our analysis relies on a novel observation involving a higher order Taylor approximation of gradient maps, which can be of independent interest. Numerical simulations validate our findings.
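For intuition, below is a minimal, self-contained sketch of the DSGD update the abstract analyzes: each agent mixes its iterate with its neighbors' through a doubly stochastic matrix W, then takes a local stochastic gradient step. All concrete choices here (the ring topology, the quadratic local objectives, the additive gradient noise, and the step size gamma) are illustrative assumptions, not the paper's experimental setup.

import numpy as np

# Minimal DSGD sketch (illustrative assumptions, not the paper's setup).
# n agents cooperatively minimize (1/n) * sum_i f_i(x) over a graph
# encoded by a doubly stochastic mixing matrix W. Each local objective
# is a hypothetical least-squares f_i(x) = 0.5 * ||A_i x - b_i||^2.

rng = np.random.default_rng(0)
n, d, gamma, T = 8, 5, 0.01, 2000

# Hypothetical local data; keeping the A_i close to each other makes the
# local Hessians A_i^T A_i similar, i.e., high data homogeneity.
A = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(n)]
b = [rng.standard_normal(d) for _ in range(n)]

# Doubly stochastic mixing matrix for a ring graph (each agent averages
# with its two neighbors); its spectral gap plays the role of rho.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

X = np.zeros((n, d))  # row i holds agent i's current iterate
for _ in range(T):
    # Stochastic gradient: exact local gradient plus noise
    # (a placeholder for minibatch sampling).
    G = np.stack([A[i].T @ (A[i] @ X[i] - b[i]) for i in range(n)])
    G += 0.01 * rng.standard_normal((n, d))
    # One DSGD step: gossip averaging followed by a local gradient step.
    X = W @ X - gamma * G

print("consensus error:", np.linalg.norm(X - X.mean(axis=0)))

Because the local Hessians in this toy problem are deliberately close to one another, it mimics the high-homogeneity regime in which the paper derives its improved transient-time bounds.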