Contextual Dynamic Pricing with Strategic Buyers

Publication type:
Article
Authors:
Liu, Pangpang; Yang, Zhuoran; Wang, Zhaoran; Sun, Will Wei
Affiliations:
Purdue University System; Purdue University; Yale University; Northwestern University
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2024.2370613
Publication date:
2025
Pages:
896-908
Keywords:
Abstract:
Personalized pricing, which involves tailoring prices based on individual characteristics, is commonly used by firms to implement a consumer-specific pricing policy. In this process, buyers can strategically manipulate their feature data to obtain a lower price, incurring certain manipulation costs. Such strategic behavior can hinder firms from maximizing their profits. In this article, we study the contextual dynamic pricing problem with strategic buyers. The seller observes not the buyer's true feature but a manipulated feature that results from the buyer's strategic behavior. In addition, the seller does not observe the buyer's valuation of the product, but only a binary response indicating whether a sale happens or not. Recognizing these challenges, we propose a strategic dynamic pricing policy that incorporates the buyers' strategic behavior into the online learning to maximize the seller's cumulative revenue. We first prove that existing nonstrategic pricing policies that neglect the buyers' strategic behavior incur a linear Ω(T) regret, where T is the total time horizon, indicating that these policies are no better than a random pricing policy. We then establish an O(√T) regret upper bound for our proposed policy and an Ω(√T) regret lower bound for any pricing policy within our problem setting. This underscores the rate optimality of our policy. Importantly, our policy is not a mere amalgamation of existing dynamic pricing policies and strategic behavior handling algorithms. Our policy can also accommodate the scenario in which the marginal cost of manipulation is unknown in advance. To account for this, we simultaneously estimate the valuation parameter and the cost parameter in the online pricing policy, which is shown to also achieve an O(√T) regret bound.
Extensive experiments support our theoretical developments and demonstrate the superior performance of our policy compared to other pricing policies that are unaware of the strategic behaviors. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.