Multiagent Reinforcement Learning for Constrained Markov Decision Processes by Consensus-Based Primal-Dual Method
Publication Type:
Article
Authors:
Cui, Gaochen; Jia, Qing-Shan; Guan, Xiaohong
Affiliations:
Tsinghua University; Xi'an Jiaotong University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2025.3534639
Publication Date:
2025
Pages:
4217-4224
Keywords:
costs
Markov decision processes
convergence
privacy
power system dynamics
multi-robot systems
heuristic algorithms
linear programming
Lagrangian functions
cost function
constrained Markov decision process
multiagent
primal-dual
reinforcement learning (RL)
Abstract:
In this work, we consider multiagent reinforcement learning for constrained Markov decision processes and develop a consensus-based primal-dual method to solve the problem, which is model-free and has provable convergence. Compared with existing methods, our algorithm neither requires the dynamic model of the system nor asks the agents to share their local policies. The constraint is incorporated into the objective function to form the Lagrangian, whose dual variables are updated through the primal-dual method. The consensus-based method is applied to update the parameters of the approximate action-value functions and the dual variables in a distributed manner. The developed algorithm is shown to achieve consensus among the agents and converge to a locally optimal policy. For a certain class of constrained Markov decision processes, a method to ensure the feasibility of the final solution is developed. Numerical results show that the developed algorithm outperforms the multiagent actor-critic algorithm (Liu et al., 2018), which incorporates the constraint directly into the objective.
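
As a rough illustration of the updates described in the abstract, the minimal sketch below performs one synchronous consensus-based primal-dual iteration. It assumes linear action-value approximation Q_i(s, a) ≈ w_i^T φ(s, a) and a doubly stochastic consensus weight matrix W; all names, signatures, and step sizes (phi_sa, alpha, beta, c_bound, etc.) are hypothetical choices for illustration, not taken from the paper.

import numpy as np

def consensus_primal_dual_step(w, lam, W, phi_sa, phi_next, rewards, costs,
                               c_bound, gamma=0.99, alpha=0.01, beta=0.005):
    """One hypothetical synchronous update for N agents.

    w        : (N, d) primal parameters of the approximate action-value functions
    lam      : (N,)   local dual variables (Lagrange multipliers)
    W        : (N, N) doubly stochastic consensus weight matrix
    phi_sa   : (d,)   feature vector of the current state-action pair
    phi_next : (d,)   feature vector of the next state-action pair
    rewards  : (N,)   local rewards observed by each agent
    costs    : (N,)   local constraint costs observed by each agent
    c_bound  : float  budget in the constraint E[cost] <= c_bound
    """
    w_new, lam_new = np.empty_like(w), np.empty_like(lam)
    for i in range(len(w)):
        # Lagrangian reward: local reward penalized by the local dual variable.
        lagr_reward = rewards[i] - lam[i] * costs[i]
        # Temporal-difference error of the Lagrangian action-value function.
        td = lagr_reward + gamma * w[i] @ phi_next - w[i] @ phi_sa
        # Primal update: mix neighbors' parameters via the consensus matrix,
        # then take a local semi-gradient step (only parameters are exchanged,
        # never local policies).
        w_new[i] = W[i] @ w + alpha * td * phi_sa
        # Dual update: consensus on the multipliers, then projected gradient
        # ascent on a stochastic estimate of the constraint violation.
        lam_new[i] = max(0.0, W[i] @ lam + beta * (costs[i] - c_bound))
    return w_new, lam_new

Because each agent exchanges only its parameter vector w_i and dual variable lam_i with its neighbors, no local policy ever leaves an agent, which is consistent with the privacy property the abstract claims for the method.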