Setwise Coordinate Descent for Dual Asynchronous Decentralized Optimization
Publication Type:
Article
Authors:
Costantini, Marina; Liakopoulos, Nikolaos; Mertikopoulos, Panayotis; Spyropoulos, Thrasyvoulos
Affiliations:
IMT - Institut Mines-Telecom; EURECOM; Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Inria; Technical University of Crete
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2025.3543463
Publication Date:
2025
Pages:
5349-5364
Keywords:
optimization
convergence
standards
costs
training
vectors
protocols
probability distribution
machine learning algorithms
machine learning
convex optimization
coordinate descent (CD)
decentralized optimization
distributed machine learning
distributed optimization
multiagent optimization
optimization over networks
Abstract:
In decentralized optimization over networks, synchronizing the updates of all nodes incurs significant communication overhead. Much of the recent literature has therefore focused on designing asynchronous algorithms where nodes can activate at any time and contact a single neighbor to complete an iteration. However, most works assume that the neighbor is selected at random according to a fixed probability distribution, a choice that ignores the optimization landscape at activation time. In this work, we instead introduce an optimization-aware rule that chooses the neighbor providing the highest dual cost improvement (a quantity related to a consensus-based dualization of the problem). This scheme is related to the coordinate descent (CD) method with the Gauss-Southwell (GS) rule for coordinate updates; in our setting, however, only a subset of coordinates is accessible at each iteration (because each node can communicate only with its neighbors), so the existing literature on GS methods does not apply. To overcome this, we develop a new analytical framework for smooth and strongly convex functions that covers our new class of setwise CD algorithms, a class that applies to both decentralized and parallel distributed computing scenarios, and we show that the proposed setwise GS rule can speed up convergence, in terms of iterations, by a factor equal to the size of the largest coordinate set. We also analyze extensions of these algorithms that exploit knowledge of the smoothness constants when available, and otherwise propose an algorithm to estimate them. Finally, we validate our theoretical results through extensive simulations.
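To make the setwise GS rule concrete, the following is a minimal Python sketch of setwise coordinate descent on a toy quadratic, assuming that at each iteration one coordinate set (e.g., the coordinates a node can access through its neighbors) becomes available and the coordinate with the largest gradient magnitude within that set is updated. The function name setwise_gs_cd, the step size, and the coordinate sets are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def setwise_gs_cd(grad, x0, coord_sets, step, num_iters, rng=None):
        """Illustrative setwise CD with a Gauss-Southwell rule.

        At each iteration one coordinate set is drawn uniformly at random
        (modeling which coordinates are accessible upon a node activation),
        and within that set the coordinate with the largest gradient
        magnitude is updated (the setwise GS rule).
        """
        rng = np.random.default_rng() if rng is None else rng
        x = x0.copy()
        for _ in range(num_iters):
            S = coord_sets[rng.integers(len(coord_sets))]  # accessible coordinates
            g = grad(x)
            j = max(S, key=lambda i: abs(g[i]))            # setwise GS selection
            x[j] -= step * g[j]                            # gradient step on coordinate j
        return x

    # Toy usage: f(x) = 0.5 * x^T A x - b^T x, with hypothetical coordinate
    # sets mimicking nodes that each own the coordinates of their incident edges.
    A = np.diag([2.0, 3.0, 1.5])
    b = np.array([1.0, -2.0, 0.5])
    grad = lambda x: A @ x - b
    coord_sets = [[0], [0, 1], [1, 2], [2]]
    x_hat = setwise_gs_cd(grad, np.zeros(3), coord_sets, step=1.0 / 3.0, num_iters=500)
    print(x_hat, np.linalg.solve(A, b))  # iterate vs. exact minimizer

The step size 1/3 is chosen as the inverse of the largest coordinate-wise smoothness constant (the largest diagonal entry of A); the paper's extensions that exploit or estimate per-coordinate smoothness constants would refine exactly this choice.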
Source URL: