作者:Muller, A
摘要:The present work deals with the comparison of (discrete time) Markov decision processes (MDPs), which differ only in their transition probabilities. We show that the optimal value function of an MDP is monotone with respect to appropriately defined stochastic order relations. We also find conditions for continuity with respect to suitable probability metrics. The results are applied to some well-known examples, including inventory control and optimal stopping.
作者:Goldfarb, D; Jin, ZY; Orlin, JB
作者单位:Massachusetts Institute of Technology (MIT)
摘要:This paper presents two new combinatorial algorithms for the generalized circulation problem. After an initial step in which all flow-generating cycles are canceled and excesses are created, both algorithms bring these excesses to the sink via highest-gain augmenting paths. Scaling is applied to the fixed amount of flow that the algorithms attempt to send to the sink, and both node and are excesses are used. The algorithms have worst-case complexities of O(m(2)(m + n log n) log B), where n is ...