Approximate Constrained Discounted Dynamic Programming With Uniform Feasibility and Optimality

Publication type:
Article
Author:
Chang, Hyeong Soo
Affiliation:
Sogang University
Journal:
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
ISSN/ISBN:
0018-9286
DOI:
10.1109/TAC.2024.3523847
Publication date:
2025
Pages:
4031-4036
Keywords:
Costs; Scheduling; Programming; Probability distribution; Dynamic programming; Approximation algorithms; Trajectory; Throughput; Scheduling algorithms; Reviews; Dynamic programming (DP); Markov decision process; optimality equation; uniform-optimality
Abstract:
An important question about the finite constrained Markov decision process (CMDP) problem is whether there exists a condition under which a uniformly optimal and uniformly feasible policy exists in the set of deterministic, history-independent, stationary policies, i.e., a policy that achieves the optimal value at all initial states, and whether a CMDP problem satisfying the condition can be solved by dynamic programming (DP). This matters because the crux of the unconstrained MDP theory developed by Bellman lies in the answer to the same existence question for MDPs. Although CMDPs have been studied over the years, no directly responsive work has appeared since this open question was raised about three decades ago in the literature. We establish, as a partial answer to this question, that any finite CMDP problem M(c) inherently contains a DP structure in a subordinate CMDP problem M̃(c) induced from the parameters of M(c), and that M̃(c) is DP-solvable. We derive a policy-iteration-type algorithm for solving M̃(c), providing an approximate solution to M(c), or to M(c) with a fixed initial state.
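The paper's algorithm for the subordinate problem M̃(c) is not reproduced in this record; as background for the "policy-iteration-type" machinery the abstract refers to, the following is a minimal sketch of standard policy iteration for an unconstrained finite discounted MDP with costs. The array layout (`P[a, s, s']`, `c[a, s]`) and the function name are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def policy_iteration(P, c, gamma):
    """Standard policy iteration for a finite discounted cost-MDP (sketch).

    P: (A, S, S) array, P[a, s, t] = transition probability s -> t under action a.
    c: (A, S) array of one-step costs; gamma: discount factor in (0, 1).
    Returns a deterministic stationary policy (action per state) and its value.
    """
    A, S, _ = P.shape
    pi = np.zeros(S, dtype=int)              # arbitrary initial policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = c_pi exactly.
        P_pi = P[pi, np.arange(S)]           # (S, S) rows selected by pi
        c_pi = c[pi, np.arange(S)]           # (S,)
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, c_pi)
        # Policy improvement: greedy one-step lookahead (minimize cost).
        q = c + gamma * (P @ v)              # (A, S) state-action values
        new_pi = q.argmin(axis=0)
        if np.array_equal(new_pi, pi):       # fixed point: uniformly optimal
            return pi, v
        pi = new_pi
```

The returned policy is deterministic, history-independent, and stationary, and is optimal at every initial state simultaneously; this is exactly the uniform-optimality property whose constrained analogue the paper investigates.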