A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees

成果类型:
Article
署名作者:
de Farias, Daniela Pucci; Van Roy, Benjamin
署名单位:
Massachusetts Institute of Technology (MIT); Stanford University; Stanford University
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
DOI:
10.1287/moor.1060.0208
发表日期:
2006
页码:
597-620
关键词:
DECISION-PROCESSES queuing-networks Iteration
摘要:
We introduce a new algorithm based on linear programming for optimization of average-cost Markov decision processes (MDPs). The algorithm approximates the differential cost function of a perturbed MDP via a linear combination of basis functions. We establish a bound on the performance of the resulting policy that scales gracefully with the number of states without imposing the strong Lyapunov condition required by its counterpart in de Farias and Van Roy (de Farias, D. R, B. Van Roy. 2003. The linear programming approach to approximate dynamic programming. Oper Res. 51(6) 850-865]. We investigate implications of this result in the context of a queueing control problem.
来源URL: