Multiagent Reinforcement Learning for Constrained Markov Decision Processes by Consensus-Based Primal-Dual Method
Publication Type:
Article
Authors:
Cui, Gaochen; Jia, Qing-Shan; Guan, Xiaohong
Affiliations:
Tsinghua University; Xi'an Jiaotong University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2025.3534639
Publication Date:
2025
Pages:
4217-4224
Keywords:
costs
Markov decision processes
convergence
privacy
power system dynamics
multi-robot systems
heuristic algorithms
linear programming
Lagrangian functions
cost function
constrained Markov decision process
multiagent
primal-dual
reinforcement learning (RL)
Abstract:
In this work, we consider multiagent reinforcement learning for constrained Markov decision processes and develop a consensus-based primal-dual method to solve the problem, which is model-free and has provable convergence. Compared with existing methods, our algorithm neither requires the dynamic model of the system nor asks the agents to share their local policies. The constraint is incorporated into the objective function to form the Lagrangian, whose dual variables are updated through the primal-dual method. The consensus-based method is applied to update the parameters of the approximate action-value functions and the dual variables in a distributed manner. The developed algorithm is shown to achieve consensus among the agents and converge to a locally optimal policy. For a certain class of constrained Markov decision processes, a method to ensure the feasibility of the final solution is developed. Numerical results show that the developed algorithm outperforms the multiagent actor-critic algorithm (Liu et al., 2018), which incorporates the constraint directly into the objective.
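
As a rough illustration of the updates described in the abstract, the minimal sketch below performs one synchronous consensus-based primal-dual iteration. It assumes linear action-value approximation Q_i(s, a) ≈ w_i^T φ(s, a) and a doubly stochastic consensus weight matrix W; all names, signatures, and step sizes (phi_sa, alpha, beta, c_bound, etc.) are hypothetical choices for illustration, not taken from the paper.

import numpy as np

def consensus_primal_dual_step(w, lam, W, phi_sa, phi_next, rewards, costs,
                               c_bound, gamma=0.99, alpha=0.01, beta=0.005):
    """One hypothetical synchronous update for N agents.

    w        : (N, d) primal parameters of the approximate action-value functions
    lam      : (N,)   local dual variables (Lagrange multipliers)
    W        : (N, N) doubly stochastic consensus weight matrix
    phi_sa   : (d,)   feature vector of the current state-action pair
    phi_next : (d,)   feature vector of the next state-action pair
    rewards  : (N,)   local rewards observed by each agent
    costs    : (N,)   local constraint costs observed by each agent
    c_bound  : float  budget in the constraint E[cost] <= c_bound
    """
    w_new, lam_new = np.empty_like(w), np.empty_like(lam)
    for i in range(len(w)):
        # Lagrangian reward: local reward penalized by the local dual variable.
        lagr_reward = rewards[i] - lam[i] * costs[i]
        # Temporal-difference error of the Lagrangian action-value function.
        td = lagr_reward + gamma * w[i] @ phi_next - w[i] @ phi_sa
        # Primal update: mix neighbors' parameters via the consensus matrix,
        # then take a local semi-gradient step (only parameters are exchanged,
        # never local policies).
        w_new[i] = W[i] @ w + alpha * td * phi_sa
        # Dual update: consensus on the multipliers, then projected gradient
        # ascent on a stochastic estimate of the constraint violation.
        lam_new[i] = max(0.0, W[i] @ lam + beta * (costs[i] - c_bound))
    return w_new, lam_new

Because each agent exchanges only its parameter vector w_i and dual variable lam_i with its neighbors, no local policy ever leaves an agent, which is consistent with the privacy property the abstract claims for the method.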