A two-state partially observable Markov decision process with uniformly distributed observations
Publication type:
Article
Author(s):
Grosfeld-Nir, A
Affiliation:
Northwestern University
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.44.3.458
Publication date:
1996
Pages:
458-463
Keywords:
Abstract:
A controller observes a production system periodically, over time. If the system is in the GOOD state during one period, there is a constant probability that it will deteriorate and be in the BAD state during the next period (and remain there). The true state of the system is unobservable and can only be inferred from observations (quality of output). Two actions are available: CONTINUE or REPLACE (for a fixed cost). The objective is to maximize the expected discounted value of the total future income. For both the finite- and infinite-horizon problems, the optimal policy is of a CONTROL LIMIT (CLT) type: continue if the good-state probability exceeds the CLT, and replace otherwise. The computation of the CLT involves a functional equation. An analytical solution for this equation is as yet unknown. For uniformly distributed observations we obtain the infinite-horizon CLT analytically. We also show that the finite-horizon CLTs, as a function of the time remaining, are not necessarily monotone, which is counterintuitive.
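The belief dynamics and control-limit rule described in the abstract can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's own code: the deterioration probability `q`, the observation densities `f_good`/`f_bad`, and the control limit `clt` are all hypothetical placeholders, and the Bayes update follows the standard two-state POMDP filter (propagate the good-state probability through the deterioration step, then condition on the observed output quality).

```python
def belief_update(p, y, q, f_good, f_bad):
    """One-step belief update for the two-state POMDP.

    p      : current probability that the system is in the GOOD state
    y      : observed output quality this period
    q      : probability a GOOD system deteriorates to BAD (absorbing)
    f_good : observation density given the GOOD state
    f_bad  : observation density given the BAD state
    """
    # Propagate through the deterioration step: GOOD survives w.p. (1 - q).
    prior = p * (1.0 - q)
    # Condition on the observation via Bayes' rule.
    num = prior * f_good(y)
    den = num + (1.0 - prior) * f_bad(y)
    return num / den if den > 0 else 0.0


def clt_policy(p, clt):
    """Control-limit rule: continue iff the good-state belief exceeds the CLT."""
    return "CONTINUE" if p > clt else "REPLACE"
```

For example, with uniform observation densities one might take `f_good = lambda y: 1.0` on [0, 1] and `f_bad = lambda y: 2.0` on [0, 0.5], so a low-quality observation shifts the belief toward BAD; after a REPLACE action the belief resets to 1 (a new system is known to be GOOD).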