您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 管理科学与工程 > IEEE Transactions on Automatic Control > 2025 > 4期

Accelerated Distributed Stochastic Nonconvex Optimization Over Time-Varying Directed Networks

成果类型：

Article

署名作者：

Chen, Yiyue; Hashemi, Abolfazl; Vikalo, Haris

署名单位：

University of Texas System; University of Texas Austin; Purdue University System; Purdue University

刊物名称：

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

ISSN/ISSBN：

0018-9286

DOI：

10.1109/TAC.2024.3479888

发表日期：

2025

页码：

2196-2211

关键词：

STOCHASTIC PROCESSES optimization Signal processing algorithms CONVERGENCE computational modeling Heuristic algorithms Complexity theory linear programming Distance learning Directed graphs Decentralized nonconvex optimization stochastic nonconvex optimization time-varying directed network

摘要：

Distributed stochastic nonconvex optimization problems have recently received attention due to the growing interest of signal processing, computer vision, and natural language processing communities in applications deployed over distributed learning systems (e.g., federated learning). We study the setting where the data is distributed across the nodes of a time-varying directed network, a topology suitable for modeling dynamic networks experiencing communication delays and straggler effects. The network nodes, which can access only their local objectives and query a stochastic first-order oracle to obtain gradient estimates, collaborate to minimize a global objective function by exchanging messages with their neighbors. We propose an algorithm, novel to this setting, that leverages stochastic gradient descent with momentum and gradient tracking to solve distributed nonconvex optimization problems over time-varying networks. To analyze the algorithm, we tackle the challenges that arise when analyzing dynamic network systems that communicate gradient acceleration components. We prove that the algorithm's oracle complexity is O(1/epsilon(1.5)), and that under Polyak-Lojasiewicz condition the algorithm converges linearly to a steady error state. The proposed scheme is tested on several learning tasks: a nonconvex logistic regression experiment on the MNIST dataset, an image classification task on the CIFAR-10 dataset, and an NLP classification test on the IMDB dataset. We further present numerical simulations with an objective that satisfies the PL condition. The results demonstrate superior performance of the proposed framework compared to the existing related methods.