Optimality of Quasi-Open-Loop Policies for Discounted Semi-Markov Decision Processes
Publication type:
Article
Authors:
Adelman, Daniel; Mancini, Angelo J.
Affiliation:
University of Chicago
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.2015.0775
Publication date:
2016
Pages:
1222-1247
Keywords:
optimization
existence
algorithm
Abstract:
Quasi-open-loop policies consist of sequences of Markovian decision rules that are insensitive to one component of the state space. Given a semi-Markov decision process (SMDP), we distinguish between exogenous and endogenous state components as follows: (i) the decision-maker's actions do not impact the evolution of an exogenous state component, and (ii) between consecutive decision epochs, the exogenous and endogenous state components are conditionally independent given the decision-maker's latest action. For simplicity, we consider an SMDP with one exogenous and one endogenous state component. When transition times between epochs are conditionally independent of the exogenous state given the most recent action, and the exogenous component is a multiplicative compound Poisson process, we provide an almost-everywhere condition on the reward function sufficient for the optimality of a quasi-open-loop policy. After adjusting the discount factor to account for the statistical properties of the exogenous state process, obtaining this policy amounts to solving a reduced SMDP in which the exogenous state is static. Depending on the relationship between the structure of the exogenous state process and the shape of the reward function, we can replace the almost-everywhere condition with one that applies only in expectation. Quasi-open-loop optimality holds even if the times between decision epochs depend on the Poisson process underlying the exogenous state component, and/or the Poisson process is replaced with a generic counting process.
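A minimal sketch of the discount-factor adjustment mentioned in the abstract, under assumptions of our own (a reward that scales linearly in the exogenous component; the paper's actual almost-everywhere condition is more general, and this notation is not necessarily the authors'): let the exogenous state follow the multiplicative compound Poisson process
\[
p_t = p_0 \prod_{i=1}^{N(t)} Y_i,
\]
where \(N(t)\) is a Poisson process with rate \(\lambda\) and the multiplicative jumps \(Y_i\) are i.i.d. and independent of \(N\). The Poisson probability generating function gives
\[
\mathbb{E}[p_t] = p_0\, \mathbb{E}\big[(\mathbb{E}[Y])^{N(t)}\big] = p_0\, e^{\lambda t (\mathbb{E}[Y]-1)},
\]
so for a reward linear in \(p\), continuous-time discounting at rate \(\alpha\) satisfies \(\mathbb{E}[e^{-\alpha t} p_t] = p_0\, e^{-\tilde{\alpha} t}\) with the adjusted rate \(\tilde{\alpha} = \alpha - \lambda(\mathbb{E}[Y]-1)\). Under these assumptions, one could then solve a reduced SMDP in which the exogenous state is frozen at \(p_0\) and the discount rate is \(\tilde{\alpha}\), consistent with the reduction the abstract describes.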