Approximate Linear Programming for Average Cost MDPs

成果类型:
Article
署名作者:
Veatch, Michael H.
刊物名称:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISSBN:
0364-765X
DOI:
10.1287/moor.1120.0574
发表日期:
2013
页码:
535-544
关键词:
摘要:
We consider the linear programming approach to approximate dynamic programming with an average cost objective and a finite state space. Using a Lagrangian form of the linear program (LP), the average cost error is shown to be a multiple of the best fit differential cost error. This result is analogous to previous error bounds for a discounted cost objective. Second, bounds are derived for average cost error and performance of the policy generated from the LP that involve the mixing time of the Markov decision process (MDP) under this policy or the optimal policy. These results improve on a previous performance bound involving mixing times.