NONPARAMETRIC LEARNING FOR IMPULSE CONTROL PROBLEMS-EXPLORATION VS. EXPLOITATION
Document type:
Article
Authors:
Christensen, Soren; Strauch, Claudia
Affiliations:
University of Kiel; Aarhus University
Journal:
ANNALS OF APPLIED PROBABILITY
ISSN/ISBN:
1050-5164
DOI:
10.1214/22-AAP1849
Publication date:
2023
Pages:
1369-1387
Keywords:
Ambiguity
MODEL
Abstract:
One of the fundamental assumptions in stochastic control of continuous time processes is that the dynamics of the underlying (diffusion) process is known. This is, however, usually obviously not fulfilled in practice. On the other hand, over the last decades, a rich theory for nonparametric estimation of the drift (and volatility) for continuous time processes has been developed. The aim of this paper is bringing together techniques from stochastic control with methods from statistics for stochastic processes to find a way to both learn the dynamics of the underlying process and control in a reasonable way at the same time. More precisely, we study a long-term average impulse control problem, a stochastic version of the classical Faustmann timber harvesting problem. One of the problems that immediately arises is an exploration-exploitation dilemma as is well known for problems in machine learning. We propose a way to deal with this issue by combining exploration and exploitation periods in a suitable way. Our main finding is that this construction can be based on the rates of convergence of estimators for the invariant density. Using this, we obtain that the average cumulated regret is of uniform order $O(T^{-1/3})$.