Bias and variance approximation in value function estimates
Document type:
Article
Authors:
Mannor, Shie; Simester, Duncan; Sun, Peng; Tsitsiklis, John N.
Affiliations:
McGill University; Massachusetts Institute of Technology (MIT); Duke University; Massachusetts Institute of Technology (MIT)
Journal:
MANAGEMENT SCIENCE
ISSN/ISBN:
0025-1909
DOI:
10.1287/mnsc.1060.0614
Publication date:
2007
Pages:
308-322
Keywords:
value function
confidence interval
variance
bias
Abstract:
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance, which can then be used to derive confidence intervals around the value function estimates. We illustrate and validate our findings using a large database describing the transaction and mailing histories for customers of a mail-order catalog firm.
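The effect described in the abstract can be illustrated numerically. The sketch below is not the paper's closed-form approximation; it is a Monte Carlo illustration under stated assumptions: a fixed policy (so the MDP reduces to a Markov reward process), known rewards, and a transition matrix estimated from a fixed number of sampled transitions per state. The matrix `P_true`, the reward vector `r`, the discount factor `gamma`, and the helper names `value_function` and `simulate_estimates` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def value_function(P, r, gamma):
    """Solve V = r + gamma * P @ V for a fixed policy."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)


def simulate_estimates(P, r, gamma, n_samples, n_reps, seed=0):
    """Re-estimate P from sampled transitions and recompute V each time."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    estimates = np.empty((n_reps, n))
    for k in range(n_reps):
        # Draw n_samples next-state observations from every state.
        counts = np.vstack([rng.multinomial(n_samples, P[s]) for s in range(n)])
        P_hat = counts / n_samples  # empirical transition probabilities
        estimates[k] = value_function(P_hat, r, gamma)
    return estimates


# Hypothetical 3-state example (illustrative values, not from the paper).
P_true = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.5, 0.2],
                   [0.1, 0.3, 0.6]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

V_true = value_function(P_true, r, gamma)
est = simulate_estimates(P_true, r, gamma, n_samples=50, n_reps=2000)

# Empirical bias and variance of the plug-in value estimates, per state.
bias = est.mean(axis=0) - V_true
variance = est.var(axis=0, ddof=1)
```

Because the value function is a nonlinear function of the transition matrix, an unbiased estimate of the transition probabilities does not generally yield an unbiased value estimate, and `bias` makes that visible. A simulation like this requires knowing the true model, which is why closed-form approximations of the kind the abstract describes are useful in practice.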