On Incomplete Learning and Certainty-Equivalence Control

Document type:

Article

Authors:

Keskin, N. Bora; Zeevi, Assaf

Affiliations:

Duke University; Columbia University

Journal:

OPERATIONS RESEARCH

ISSN/ISBN:

0030-364X

DOI:

10.1287/opre.2017.1713

Publication date:

2018

Pages:

1136-1167

Keywords:

stochastic regression models; least-squares estimation; asymptotic properties; adaptive control; bandit problem; Markov chains; approximation; experimentation; identification; allocation

Abstract:

We consider a dynamic learning problem in which a decision maker sequentially selects a control and observes a response variable that depends on the chosen control and an unknown sensitivity parameter. After every observation, the decision maker updates his or her estimate of the unknown parameter and uses a certainty-equivalence decision rule to determine subsequent controls based on this estimate. We show that under this certainty-equivalence learning policy the parameter estimates converge with positive probability to an uninformative fixed point that can differ from the true value of the unknown parameter, a phenomenon we refer to as incomplete learning. In stark contrast, we show that this certainty-equivalence policy may avoid incomplete learning if the parameter value of interest drifts away from the uninformative fixed point at a critical rate. Finally, we prove that one can adaptively limit the learning memory to improve the accuracy of the certainty-equivalence policy in both static (estimation) and slowly varying (tracking) environments, without relying on forced exploration.
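
To make the notion of an uninformative fixed point concrete, the following is a minimal sketch under assumptions that are not taken from the paper: a hypothetical one-parameter linear response model, a revenue-style objective, and the illustrative names `ce_control` and `run`. It shows only that, once the estimate sits at the fixed point, the certainty-equivalence control yields observations that leave the least-squares estimate unchanged. It does not reproduce the paper's convergence result.

```python
import random

# Hypothetical model (an assumption, not the paper's specification):
#   y = C + theta * (x - X0) + noise
# Every response curve passes through (X0, C), so the control x = X0
# carries no information about theta.
C, X0 = 1.0, 1.0
TRUE_THETA = -2.0


def ce_control(theta_hat):
    """Certainty-equivalence rule: treat theta_hat as the truth and
    maximize x * (C + theta_hat * (x - X0)) over x (assumes theta_hat < 0)."""
    return X0 / 2 - C / (2 * theta_hat)


def run(theta_hat0, steps, seed=0):
    """Alternate least-squares estimation and certainty-equivalence control."""
    rng = random.Random(seed)
    # Least-squares sufficient statistics, initialized so that
    # s_xy / s_xx equals the starting estimate theta_hat0.
    s_xx, s_xy = 1.0, theta_hat0
    for _ in range(steps):
        x = ce_control(s_xy / s_xx)
        y = C + TRUE_THETA * (x - X0) + rng.gauss(0.0, 0.1)
        s_xx += (x - X0) ** 2
        s_xy += (x - X0) * (y - C)
    return s_xy / s_xx


# The estimate at which the certainty-equivalence control equals X0.
FIXED_POINT = -C / X0

# Starting at the fixed point, the regressor (x - X0) is zero at every step,
# so noisy observations never move the estimate toward TRUE_THETA.
stuck = run(FIXED_POINT, steps=1000)
```

In this sketch the estimate stays at `FIXED_POINT` even though `TRUE_THETA` differs from it, because the chosen control never generates informative data. The policy uses no forced exploration, matching the setting described in the abstract.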