A Learning Algorithm for Risk-Sensitive Cost
Document type:
Article
Authors:
Basu, Arnab; Bhattacharyya, Tirthankar; Borkar, Vivek S.
Affiliations:
Indian Institute of Management (IIM System); Indian Institute of Management Bangalore; Indian Institute of Science (IISC) - Bangalore; Tata Institute of Fundamental Research (TIFR)
Journal:
MATHEMATICS OF OPERATIONS RESEARCH
ISSN/ISBN:
0364-765X
DOI:
10.1287/moor.1080.0324
Publication date:
2008
Pages:
880-898
Keywords:
large deviations
spectral theory
dynamics
Abstract:
A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost. Its convergence is proved using the o.d.e. method for stochastic approximation. The scheme is also extended to continuous state space processes.
Source URL: