Distributed Stochastic Optimization Under a General Variance Condition
Publication Type:
Article
Authors:
Huang, Kun; Li, Xiao; Pu, Shi
Affiliation:
The Chinese University of Hong Kong, Shenzhen
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN:
0018-9286
DOI:
10.1109/TAC.2024.3393169
Publication Date:
2024
Pages:
6105-6120
Keywords:
Optimization
Linear programming
Distributed databases
Gradient methods
Convergence
Complexity theory
Particle measurements
Distributed optimization
Nonconvex optimization
Stochastic optimization
Abstract:
Distributed stochastic optimization has attracted considerable attention recently due to its effectiveness in solving large-scale machine learning problems. Although numerous algorithms have been proposed and successfully applied to general practical problems, their theoretical guarantees mainly rely on certain boundedness conditions on the stochastic gradients, ranging from uniform boundedness to the relaxed growth condition. In addition, how to characterize the data heterogeneity among the agents and its impact on algorithmic performance remains challenging. Motivated by these issues, we revisit the classical federated averaging (FedAvg) algorithm (McMahan et al., 2017) as well as the more recent SCAFFOLD method (Karimireddy et al., 2020) for solving the distributed stochastic optimization problem, and we establish convergence results for smooth nonconvex objective functions under only a mild variance condition on the stochastic gradients. Almost sure convergence to a stationary point is also established under the same condition. Moreover, we discuss a more informative measure of data heterogeneity as well as its implications.
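For orientation, the following is a minimal sketch of the federated averaging (FedAvg) scheme the abstract refers to: each agent runs a few local stochastic gradient steps on its own data, and a server averages the resulting iterates. The least-squares objective, step size, number of local steps, and data partition below are illustrative assumptions for this sketch, not the setup analyzed in the paper.

```python
import numpy as np

def local_sgd(x, data, grad_fn, lr=0.1, local_steps=5, rng=None):
    """Run a few local stochastic gradient steps starting from the server model x."""
    rng = rng or np.random.default_rng()
    x = x.copy()
    for _ in range(local_steps):
        sample = data[rng.integers(len(data))]   # draw one local sample
        x -= lr * grad_fn(x, sample)             # stochastic gradient step
    return x

def fedavg(x0, agent_data, grad_fn, rounds=100, **kw):
    """Federated averaging: average the agents' locally updated models each round."""
    x = x0.copy()
    for _ in range(rounds):
        local_models = [local_sgd(x, d, grad_fn, **kw) for d in agent_data]
        x = np.mean(local_models, axis=0)        # server aggregation step
    return x

# Illustrative (hypothetical) least-squares objective: each agent holds (a, b) pairs
# and the stochastic gradient is that of 0.5 * (a^T x - b)^2.
def grad_fn(x, sample):
    a, b = sample
    return (a @ x - b) * a

rng = np.random.default_rng(0)
agent_data = [[(rng.standard_normal(3), rng.standard_normal()) for _ in range(20)]
              for _ in range(4)]
x_out = fedavg(np.zeros(3), agent_data, grad_fn, rounds=50)
```

The data heterogeneity discussed in the abstract enters through how different the agents' local datasets (and hence their local gradients) are from one another; SCAFFOLD modifies the local updates with control variates to correct for this drift.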