Maximum-Stopping-Value Policies in Finite Markov Population Decision Chains
Publication type:
Article
Authors:
Eaves, B. Curtis; Veinott, Arthur F., Jr.
Affiliations:
Stanford University; Stanford University
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.2013.0638
Publication date:
2014
Pages:
597-606
Keywords:
sequential decisions
optimality criteria
Abstract:
This paper considers infinite-horizon, finite state-and-action Markov population decision chains (MPDCs) in which the goal is to find a stationary stopping policy with maximum stopping value, that is, with maximum value over all deterministic Markov stopping policies. A policy is stopping if the expected population size it induces in a period diminishes to zero as the period tends to infinity. The paper shows that the following are equivalent: (a) there is a stationary maximum-stopping-value policy; (b) the maximum stopping value is finite; (c) there is a stopping policy and an excessive point of the optimal return operator; and (d) the maximum stopping value is the least excessive (resp., fixed) point of the optimal return operator. The paper shows how to solve the problem with linear programming, policy improvement, and successive approximations. The problem arises in stopping a Markov chain, as in optimally eradicating a pest or disease. It is one of two key subproblems used repeatedly to find Blackwell and Cesàro-overtaking optimal policies in finite Markov decision chains (MDCs), both of which, with their generalizations, have a vast number of applications. The problem for MDCs and MPDCs has been studied over the years, under various conditions, by many investigators, including Bellman, Bertsekas, Blackwell, Dantzig, Denardo, d'Epenoux, Derman, Dynkin, Eaton, Erickson, Ford, Hordijk, Howard, Kallenberg, Manne, O'Sullivan, Rothblum, Shoemaker, Strauch, Tsitsiklis, Veinott, Wagner, and Zadeh.
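To make the successive-approximations method concrete, here is a minimal, hypothetical Python sketch, not the paper's algorithm: it repeatedly applies the optimal return operator (T v)(s) = max_a [ r(s, a) + Σ_{s'} P_a(s, s') v(s') ] starting from v = 0 and reads off a greedy stationary policy. The names (`successive_approximations`, `r`, `P`) and the toy pest-control numbers are assumptions made for illustration. The sketch further assumes every action is strictly substochastic (all row sums below one), so T is a sup-norm contraction, every stationary policy is stopping, and the iterates converge to the least fixed point of T, which by equivalence (d) is the maximum stopping value.

```python
import numpy as np

def successive_approximations(r, P, tol=1e-10, max_iter=100_000):
    """Iterate v_{n+1} = T v_n from v_0 = 0, where T is the optimal
    return operator (T v)(s) = max_a [ r[s, a] + (P[a] @ v)[s] ].

    r : (S, A) array of one-period rewards.
    P : (A, S, S) array; P[a][s, s'] is the expected number of
        individuals sent to state s' per individual in state s
        choosing action a.  Rows need not sum to one in a population
        chain; here every row sum is assumed < 1, making T a
        sup-norm contraction with a unique (hence least) fixed point.
    Returns that fixed point and a greedy stationary policy.
    """
    S, A = r.shape
    v = np.zeros(S)
    for _ in range(max_iter):
        # One application of the optimal return operator T.
        q = r + np.einsum('ast,t->sa', P, v)  # q[s, a] = r[s, a] + P[a] v
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return v_new, q.argmax(axis=1)

# Hypothetical 2-state pest-control example (illustrative numbers only).
# Action 0 ("treat") is costly but shrinks the population fast;
# action 1 ("monitor") is cheap but shrinks it slowly.
r = np.array([[-1.0, -0.2],    # per-period costs in state 0 (light infestation)
              [-3.0, -0.5]])   # per-period costs in state 1 (heavy infestation)
P = np.array([[[0.1, 0.0],     # treat:   all row sums < 1
               [0.2, 0.1]],
              [[0.5, 0.2],     # monitor: all row sums < 1
               [0.1, 0.6]]])
v, policy = successive_approximations(r, P)
print(v, policy)  # maximum stopping value and a greedy stationary policy
```

In a general MPDC, where some stationary policies need not be stopping, this naive iteration need not recover the maximum stopping value, which is one reason the paper's least-excessive-point characterization and its linear-programming and policy-improvement treatments matter.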