Optimality of Base-Stock Policy Under Unknown General Demand Distributions: New Methods, New Results, and Computations
Document type:
Article; Early Access
Authors:
Bensoussan, Alain; Sethi, Suresh P.; Thiam, Abdoulaye; Turi, Janos
Affiliations:
University of Texas System; University of Texas Dallas; University of Texas System; University of Texas Dallas; Allen University; University of Texas System; University of Texas Dallas
Journal:
PRODUCTION AND OPERATIONS MANAGEMENT
ISSN/ISBN:
1059-1478
DOI:
10.1177/10591478251345135
Publication date:
2025
Keywords:
Bayesian learning
dynamic programming
Bellman equation
unnormalized probability
base-stock policy
computational methods
Abstract:
This article advances the literature on the optimality of the base-stock policy for a general demand distribution and a general prior belief, which we update as we observe realized demands. The demands are assumed to be continuous, independent, and identically distributed random variables. The value function depends on the belief, so the functional Bellman equation is infinite-dimensional. In contrast with traditional approaches, we derive a functional equation for the derivative of the value function with respect to the inventory level, which provides a direct approach to computing the optimal base-stock policy. In two well-known cases, we characterize how the base-stock level depends on the belief, and we implement the approach to compute the optimal base-stock level. In the first case, that of conjugate probabilities, the infinite-dimensional state reduces to a finite-dimensional sufficient statistic. This reduction allows us to solve two numerical examples, with exponential and Weibull demands. Moreover, for the exponential demand example, we compare the optimal cost with the costs achieved by two myopic policies under three guesses of the initial belief. We find that the optimal policy improves upon the first myopic policy by 12.6%, 13.0%, and 9.2%, and upon the second myopic policy by 28.7%, 26.9%, and 27.7%. In the second case, the demand comes from one of two possible distributions, but we do not know which. Here, we derive a functional equation in one hyperparameter expressing the ratio of the weights assigned to the two distributions. We then develop an approximation scheme to solve it, show that it converges, and implement it numerically to obtain the optimal base-stock levels over time.
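The two belief reductions mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or code. It assumes, for illustration only, exponential demand with an unknown rate and a Gamma(shape, rate) prior, for which the posterior after one observed demand is again Gamma, so the belief is carried by two numbers. The class `GammaBelief`, the functions `myopic_base_stock` and `update_weight_ratio`, and the cost parameters are hypothetical names introduced here. The myopic rule shown is a plain newsvendor quantile of the predictive distribution and need not match either myopic policy compared in the article.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GammaBelief:
    """Assumed Gamma(shape, rate) belief over the rate of exponential demand."""
    shape: float
    rate: float

    def update(self, demand: float) -> "GammaBelief":
        # Conjugate update: one fully observed demand adds 1 to the shape
        # and the observed demand to the rate.
        return GammaBelief(self.shape + 1.0, self.rate + demand)

    def predictive_quantile(self, q: float) -> float:
        # Predictive survival function is (rate / (rate + x)) ** shape,
        # so the q-quantile has a closed form.
        return self.rate * ((1.0 - q) ** (-1.0 / self.shape) - 1.0)


def myopic_base_stock(belief: GammaBelief,
                      holding_cost: float,
                      shortage_cost: float) -> float:
    # Illustrative single-period newsvendor level under the current belief
    # (an assumption of this sketch, not the article's optimal policy).
    critical_ratio = shortage_cost / (shortage_cost + holding_cost)
    return belief.predictive_quantile(critical_ratio)


def update_weight_ratio(ratio: float,
                        demand: float,
                        pdf_first: Callable[[float], float],
                        pdf_second: Callable[[float], float]) -> float:
    # Two-candidate case: Bayes' rule multiplies the ratio of the weights
    # by the likelihood ratio of the observed demand.
    return ratio * pdf_first(demand) / pdf_second(demand)


if __name__ == "__main__":
    belief = GammaBelief(shape=2.0, rate=10.0)
    for d in (5.0, 3.0, 8.0):  # made-up demand observations
        level = myopic_base_stock(belief, holding_cost=1.0, shortage_cost=3.0)
        print(f"belief={belief}, myopic level={level:.3f}")
        belief = belief.update(d)

    # Hypothetical candidates: exponential densities with rates 1 and 2.
    ratio = update_weight_ratio(
        1.0, 1.0,
        lambda x: math.exp(-x),
        lambda x: 2.0 * math.exp(-2.0 * x),
    )
    print(f"updated weight ratio={ratio:.4f}")
```

In both cases the state that the dynamic program must track is finite-dimensional: two hyperparameters in the conjugate sketch and a single ratio in the two-candidate sketch.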
Source URL: