Policy Optimization Using Semiparametric Models for Dynamic Pricing
Publication type:
Article
Authors:
Fan, Jianqing; Guo, Yongyi; Yu, Mengxin
Affiliation:
Princeton University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2022.2128359
Publication date:
2024
Pages:
552-564
Keywords:
strong uniform consistency
generalized linear models
kernel regression
optimal rates
convergence
management
analytics
weak
Abstract:
In this article, we study the contextual dynamic pricing problem where the market value of a product is linear in its observed features plus some market noise. Products are sold one at a time, and only a binary response indicating success or failure of a sale is observed. Our model setting is similar to the work by Javanmard and Nazerzadeh except that we expand the demand curve to a semiparametric model and learn dynamically both parametric and nonparametric components. We propose a dynamic statistical learning and decision-making policy that minimizes regret (maximizes revenue) by combining semiparametric estimation for a generalized linear model with unknown link and online decision making. Under mild conditions, for a market noise cdf $F(\cdot)$ with $m$th-order derivative ($m \ge 2$), our policy achieves a regret upper bound of $\tilde{O}_d(T^{\frac{2m+1}{4m-1}})$, where $T$ is the time horizon and $\tilde{O}_d$ is the order hiding logarithmic terms and the feature dimension $d$. The upper bound is further reduced to $\tilde{O}_d(\sqrt{T})$ if $F$ is super smooth. These upper bounds are close to $\Omega(\sqrt{T})$, the lower bound where $F$ belongs to a parametric class. We further generalize these results to the case with dynamic dependent product features under the strong mixing condition. Supplementary materials for this article are available online.
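The setting in the abstract can be illustrated with a small Python sketch. It is not the authors' policy: it only simulates the binary-feedback model (market value = linear term in the features plus noise, sale if the posted price does not exceed the value) and evaluates the exponent (2m+1)/(4m-1) of the stated regret bound. The logistic noise distribution, the grid search for the oracle price, and all function names are assumptions made for illustration.

```python
import math
import random


def regret_exponent(m: int) -> float:
    """Exponent (2m+1)/(4m-1) of T in the regret bound quoted in the abstract."""
    if m < 2:
        raise ValueError("the quoted bound assumes m >= 2")
    return (2 * m + 1) / (4 * m - 1)


def logistic_cdf(z: float) -> float:
    """Assumed noise cdf F (illustration only), in a numerically stable form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_value(theta, x) -> float:
    """Linear part of the market value, theta^T x."""
    return sum(t * xi for t, xi in zip(theta, x))


def expected_revenue(price, theta, x, cdf=logistic_cdf) -> float:
    """price * P(sale), where P(sale) = 1 - F(price - theta^T x)."""
    return price * (1.0 - cdf(price - linear_value(theta, x)))


def oracle_price(theta, x, grid_max=10.0, steps=1000) -> float:
    """Grid search for the revenue-maximizing price when theta and F are known."""
    grid = [grid_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: expected_revenue(p, theta, x))


def simulate_sale(price, theta, x, rng) -> int:
    """Return 1 if the sale succeeds (price <= market value), else 0."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logit finite
    noise = math.log(u / (1.0 - u))  # logistic noise via the inverse cdf
    return 1 if price <= linear_value(theta, x) + noise else 0


if __name__ == "__main__":
    rng = random.Random(0)
    theta, x = [1.0, 0.5], [2.0, 1.0]  # hypothetical parameter and features
    p_star = oracle_price(theta, x)
    sales = sum(simulate_sale(p_star, theta, x, rng) for _ in range(1000))
    print("oracle price:", p_star, "sales out of 1000:", sales)
    for m in (2, 4, 8, 1000):
        print("m =", m, "exponent =", regret_exponent(m))
```

For m = 2 the exponent is 5/7, and it decreases toward 1/2 as m grows, which is consistent with the abstract's statement that the bound approaches the square-root rate for smoother F. A learning policy would only observe the 0/1 outcomes of `simulate_sale`, not the market value itself.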
Source URL: