
A Learning Algorithm for Risk-Sensitive Cost

Type:

Article

Authors:

Basu, Arnab; Bhattacharyya, Tirthankar; Borkar, Vivek S.

Affiliations:

Indian Institute of Management (IIM System); Indian Institute of Management Bangalore; Indian Institute of Science (IISC) - Bangalore; Tata Institute of Fundamental Research (TIFR)

Journal:

Mathematics of Operations Research

ISSN:

0364-765X

DOI:

10.1287/moor.1080.0324

Publication date:

2008

Pages:

880-898

Keywords:

large deviations; spectral theory; dynamics

Abstract:

A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost. Its convergence is proved using the o.d.e. method for stochastic approximation. The scheme is also extended to continuous state space processes.
