Approximate Dynamic Programming via a Smoothed Linear Program
成果类型:
Article
署名作者:
Desai, Vijay V.; Farias, Vivek F.; Moallemi, Ciamac C.
署名单位:
Columbia University; Massachusetts Institute of Technology (MIT); Columbia University
刊物名称:
OPERATIONS RESEARCH
ISSN/ISSBN:
0030-364X
DOI:
10.1287/opre.1120.1044
发表日期:
2012
页码:
655-674
关键词:
摘要:
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural projection of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program-the smoothed approximate linear program-is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.