Bayesian Inventory Control: Accelerated Demand Learning via Exploration Boosts
成果类型:
Article
署名作者:
Chuang, Ya-Tang; Kim, Michael Jong
署名单位:
National Cheng Kung University; University of British Columbia
刊物名称:
OPERATIONS RESEARCH
ISSN/ISSBN:
0030-364X
DOI:
10.1287/opre.2023.2467
发表日期:
2023
页码:
1515-1529
关键词:
Inventory Management
Bayesian learning
exploration versus exploitation trade-off
Bayesian dynamic programming
摘要:
We investigate Bayesian inventory control problems where parameters of the demand distribution are not known a priori but need to be learned using right-censored sales data. A Bayesian framework is adopted for demand learning, and the corresponding control problem is analyzed via Bayesian dynamic programming (BDP). In the Bayesian setting, it is known that the BDP-optimal decision is equal to the sum of the myopic-optimal decision plus a nonnegative exploration boost. The goal of this paper is to (i) identify those applications in which adding an exploration boost is important and (ii) characterize the form of the exploration boost. In contrast to recent research that suggests that ignoring the exploration boost (i.e., adopting the myopic policy) can perform reasonably well in certain settings, we show that for applications with moderate time horizons and high parameter uncertainty, the optimality gap between the myopic policy and the BDP-optimal policy can be arbitrarily large and in particular, grows in proportion to the posterior index of dispersion of the unknown mean demand. With regard to characterizing the form of the BDP-optimal exploration boost, we prove that the exploration boost is also proportional to the posterior index of dispersion of the unknown mean demand. This characterization expresses in clear terms the way in which the statistical learning and inventory control are jointly optimized; when there is a high degree of parameter uncertainty (encoded as a large posterior index of dispersion), inventory decisions are boosted to induce a higher chance of observing more sales data so as to more quickly resolve statistical uncertainty (i.e., accelerated demand learning), and to not do so will necessarily lead to poor performance.
来源URL: