Bayesian Exploration for Approximate Dynamic Programming

Publication type:
Article
Authors:
Ryzhov, Ilya O.; Mes, Martijn R. K.; Powell, Warren B.; van den Berg, Gerald
Affiliations:
University System of Maryland; University of Maryland College Park; University System of Maryland; University of Maryland College Park; University of Twente; Princeton University
Journal:
OPERATIONS RESEARCH
ISSN:
0030-364X
DOI:
10.1287/opre.2018.1772
Publication date:
2019
Pages:
198-214
Keywords:
knowledge gradient algorithm; optimization; simulation; selection; storage
Abstract:
Approximate dynamic programming (ADP) is a general methodological framework for multistage stochastic optimization problems in transportation, finance, energy, and other domains. We propose a new approach to the exploration/exploitation dilemma in ADP that leverages two important concepts from the optimal learning literature: first, we show how a Bayesian belief structure can be used to express uncertainty about the value function in ADP; second, we develop a new exploration strategy based on the concept of value of information and prove that it systematically explores the state space. An important advantage of our framework is that it can be integrated into both parametric and non-parametric value function approximations, which are widely used in practical implementations of ADP. We evaluate this strategy on a variety of distinct resource allocation problems and demonstrate that, although more computationally intensive, it is highly competitive against other exploration strategies.
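The value-of-information idea behind the exploration strategy can be illustrated with the knowledge-gradient score for independent normal beliefs, a standard construction from the optimal learning literature the abstract draws on. This is a generic sketch, not the paper's ADP algorithm: the beliefs here are over a finite set of alternatives rather than over a value function approximation, and the function and parameter names (`kg_scores`, `noise_std`) are illustrative.

```python
import math

def kg_scores(mu, sigma, noise_std):
    """Knowledge-gradient (value of information) score for each alternative.

    Assumes independent normal beliefs N(mu[x], sigma[x]^2) over the unknown
    values and Gaussian observation noise with standard deviation noise_std.
    The score of x is sigma_tilde * f(z), where sigma_tilde is the predictive
    reduction in the standard deviation of belief x after one observation,
    f(z) = z*Phi(z) + phi(z), and z = -|mu[x] - max_{y != x} mu[y]| / sigma_tilde.
    """
    def phi(z):   # standard normal pdf
        return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

    def Phi(z):   # standard normal cdf
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    scores = []
    for x in range(len(mu)):
        # Change in belief std dev from one noisy observation of x.
        sig_tilde = sigma[x] ** 2 / math.sqrt(sigma[x] ** 2 + noise_std ** 2)
        best_other = max(mu[y] for y in range(len(mu)) if y != x)
        if sig_tilde > 0.0:
            z = -abs(mu[x] - best_other) / sig_tilde
            scores.append(sig_tilde * (z * Phi(z) + phi(z)))
        else:
            scores.append(0.0)  # a known value carries no information
    return scores
```

Measuring alternatives this way makes exploration systematic: an alternative whose mean is competitive but whose belief is uncertain receives a high score, so sampling is directed toward states where one more observation most improves the eventual decision.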