Multimodal Dynamic Pricing

Publication type:

Article

Authors:

Wang, Yining; Chen, Boxiao; Simchi-Levi, David

Affiliations:

State University System of Florida; University of Florida; University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital; Massachusetts Institute of Technology (MIT)

Journal:

MANAGEMENT SCIENCE

ISSN/ISBN:

0025-1909

DOI:

10.1287/mnsc.2020.3819

Publication date:

2021

Pages:

6136-6152

Keywords:

multimodal reward function; dynamic pricing; nonparametric learning; asymptotic analyses

Abstract:

We consider single-product dynamic pricing with demand learning. The candidate prices range over a wide price interval; the modeling of the demand functions is nonparametric in nature, imposing only smoothness regularity conditions. One important aspect of our model is that the expected reward function may be nonconcave and indeed multimodal, which leads to many conceptual and technical challenges. Our proposed algorithm is inspired by both the Upper-Confidence-Bound algorithm for multiarmed bandits and the Optimism-in-the-Face-of-Uncertainty principle arising from linear contextual bandits. The multiarmed bandit formulation arises from a local-bin approximation of an unknown continuous demand function, and the linear contextual bandit formulation is then applied to obtain more accurate local polynomial approximators within each bin. Through rigorous regret analysis, we demonstrate that our proposed algorithm achieves optimal worst-case regret over a wide range of smooth function classes. More specifically, for $k$-times smooth functions and $T$ selling periods, the regret of our proposed algorithm is $\tilde{O}(T^{(k+1)/(2k+1)})$, which is shown to be optimal via the development of information-theoretic lower bounds. We also show that in special cases, such as strongly concave or infinitely smooth reward functions, our algorithm achieves an $O(\sqrt{T})$ regret, matching the optimal regret established in previous works. Finally, we present computational results that verify the effectiveness of our method in numerical simulations.
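
The local-bin idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: it splits the price interval into equal bins, represents each bin by its midpoint, and runs a standard UCB1-style rule over the bins. It omits the local polynomial (linear contextual bandit) step that the abstract describes. The function names, the bimodal demand curve, and all parameter values below are hypothetical and chosen only for illustration.

```python
import math
import random


def bimodal_demand(price):
    """Hypothetical purchase probability whose revenue curve has two peaks."""
    revenue = (0.3 * math.exp(-((price - 0.3) / 0.08) ** 2)
               + 0.5 * math.exp(-((price - 0.8) / 0.08) ** 2))
    return max(0.0, min(1.0, revenue / price))


def binned_ucb_pricing(demand_prob, p_min, p_max, n_bins, horizon, seed=0):
    """UCB1 over price bins (assumption: one midpoint price per bin)."""
    rng = random.Random(seed)
    width = (p_max - p_min) / n_bins
    prices = [p_min + (j + 0.5) * width for j in range(n_bins)]
    counts = [0] * n_bins          # number of periods each bin was offered
    revenue_sums = [0.0] * n_bins  # cumulative revenue collected per bin
    total_revenue = 0.0

    for t in range(1, horizon + 1):
        if t <= n_bins:
            # Offer every bin once before using confidence bounds.
            j = t - 1
        else:
            # Optimistic index: empirical mean revenue plus an exploration
            # bonus, scaled by p_max because per-period revenue is in [0, p_max].
            j = max(
                range(n_bins),
                key=lambda i: revenue_sums[i] / counts[i]
                + p_max * math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        price = prices[j]
        sold = 1 if rng.random() < demand_prob(price) else 0
        reward = price * sold
        counts[j] += 1
        revenue_sums[j] += reward
        total_revenue += reward

    return prices, counts, total_revenue


prices, counts, total_revenue = binned_ucb_pricing(
    bimodal_demand, p_min=0.1, p_max=1.0, n_bins=9, horizon=5000
)
for price, count in zip(prices, counts):
    print(f"price {price:.2f}: offered {count} times")
```

Because the index is computed per bin rather than from a global concave fit, the rule does not rely on the revenue curve having a single peak. In this sketch the number of bins is fixed by hand; the abstract's regret guarantee depends on choices tied to $T$ and the smoothness level $k$ that this simplification does not reproduce.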
