Robust Dynamic Assortment Optimization in the Presence of Outlier Customers

Document type:
Article
Authors:
Chen, Xi; Krishnamurthy, Akshay; Wang, Yining
Affiliations:
New York University; University of Texas System; University of Texas Dallas
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2020.0281
Publication date:
2024
Pages:
999-1015
Keywords:
multiarmed bandit
Abstract:
We consider the dynamic assortment optimization problem under the multinomial logit model with unknown utility parameters. The main question investigated in this paper is model misspecification under the ε-contamination model, which is a fundamental model in robust statistics and machine learning. In particular, throughout a selling horizon of length T, we assume that customers make purchases according to a well-specified underlying multinomial logit choice model in a (1 - ε)-fraction of the time periods and make arbitrary purchasing decisions in the remaining ε-fraction of the time periods. In this model, we develop a new robust online assortment optimization policy via an active-elimination strategy. We establish both upper and lower bounds on the regret, and we show that our policy is optimal up to a logarithmic factor in T when the assortment capacity is constant. We further develop a fully adaptive policy that does not require any prior knowledge of the contamination parameter ε. When a suboptimality gap exists between optimal and suboptimal products, we also establish gap-dependent logarithmic regret upper and lower bounds in both the known-ε and unknown-ε cases. Our simulation study shows that our policy outperforms the existing policies based on upper confidence bounds and Thompson sampling.
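
To make the contamination setting in the abstract concrete, the following is a minimal simulation sketch of the customer side only, not the paper's policy. It assumes the standard multinomial logit form with a no-purchase option of utility zero, and it writes the contamination parameter as `eps`. The function names, the random placement of contaminated periods, and the particular outlier behavior (always buying the lowest-utility offered item) are illustrative assumptions; the abstract only says that outlier decisions are arbitrary.

```python
import numpy as np

def mnl_choice_probs(utilities, assortment):
    """MNL choice probabilities for an offered assortment.

    Entry 0 of the result is the no-purchase probability (assumed utility 0);
    entry k >= 1 corresponds to assortment[k - 1].
    """
    v = np.exp(np.asarray(utilities, dtype=float)[list(assortment)])
    denom = 1.0 + v.sum()
    return np.concatenate(([1.0 / denom], v / denom))

def simulate_horizon(utilities, assortment, T, eps, seed=0):
    """Simulate T periods with a fixed assortment under contamination.

    Assumption: int(eps * T) periods are picked uniformly at random as
    contaminated, and the outlier customer always buys the lowest-utility
    offered item. Both choices are hypothetical stand-ins for "arbitrary".
    Returns (choices, contaminated_periods); a choice of None means no purchase.
    """
    rng = np.random.default_rng(seed)
    assortment = list(assortment)
    n_bad = int(eps * T)
    bad = set(rng.choice(T, size=n_bad, replace=False).tolist())
    worst = min(assortment, key=lambda i: utilities[i])
    probs = mnl_choice_probs(utilities, assortment)

    choices = []
    for t in range(T):
        if t in bad:
            # Outlier period: the purchase ignores the MNL model entirely.
            choices.append(worst)
        else:
            # Typical period: sample from the well-specified MNL model.
            k = int(rng.choice(len(probs), p=probs))
            choices.append(None if k == 0 else assortment[k - 1])
    return choices, bad
```

Feeding such a stream to an estimator that trusts every observation biases its utility estimates toward the outlier's behavior, which is the failure mode a robust policy is meant to withstand.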