Rate-Optimal Online Learning for Dynamic Assortment Selection with Positioning

Document type:
Article; Early Access
Authors:
Luo, Yiyun; Sun, Will Wei; Liu, Yufeng
Affiliations:
Shanghai University of Finance & Economics; Purdue University System; Purdue University; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Journal:
OPERATIONS RESEARCH
ISSN/ISBN:
0030-364X
DOI:
10.1287/opre.2024.1556
Publication date:
2025
Keywords:
Optimization MODEL
Abstract:
In online retailing, the seller aims to offer an assortment of items that maximizes revenue. We introduce a new online learning problem, dynamic assortment selection with positioning (DAP), that additionally learns the optimal positioning of items within the assortment. Specifically, customers make purchases through a multinomial logit choice model in which each item's attractiveness is the product of a position effect and an unknown preference parameter. We first demonstrate that any assortment-only algorithm that neglects position effects incurs linear regret. To address this gap, we propose the truncated linear regression upper confidence bound (TLR-UCB) policy. TLR-UCB utilizes a novel geometric linear bandit-type feedback structure for UCB construction under random and adaptive position effects. In addition, TLR-UCB conducts well-designed truncations before applying linear regression to handle conditional geometric responses. In theory, we establish a regret upper bound of O(T^{1/2}) for TLR-UCB, matching our derived lower bound. Moreover, we develop an explore-in-TLR-UCB (EI-TLR) policy to tackle unknown position effects. It first conducts a joint learning procedure to estimate the unknown preferences and position effects, and then implements a generalized TLR-UCB procedure driven by the estimated position effects. Extensive experiments demonstrate the superior performance of TLR-UCB and EI-TLR over other benchmark policies.
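The choice model described in the abstract can be illustrated with a minimal sketch. It assumes a standard multinomial logit form in which the no-purchase option has attractiveness normalized to 1; that normalization, the function names, and all numeric values are hypothetical and are not taken from the paper. The sketch only shows why positioning can change expected revenue for a fixed assortment; it does not implement TLR-UCB or EI-TLR.

```python
def choice_probabilities(preferences, position_effects):
    """Purchase probabilities for items listed in slot order.

    preferences[k] is the preference parameter of the item shown in slot k;
    position_effects[k] is the multiplicative effect of slot k.
    Returns (per-item probabilities, no-purchase probability).
    """
    # Attractiveness = position effect * preference, as stated in the abstract.
    attractiveness = [lam * v for lam, v in zip(position_effects, preferences)]
    # Assumption: the no-purchase option contributes 1 to the denominator.
    denom = 1.0 + sum(attractiveness)
    return [a / denom for a in attractiveness], 1.0 / denom


def expected_revenue(revenues, preferences, position_effects):
    """Expected revenue of one displayed, ordered assortment."""
    p_items, _ = choice_probabilities(preferences, position_effects)
    return sum(r * p for r, p in zip(revenues, p_items))


# Hypothetical example: two items, two slots; the first slot is more prominent.
position_effects = [1.0, 0.5]

# Ordering A: item X (revenue 1, preference 1.0) first, item Y (revenue 2, preference 0.5) second.
rev_a = expected_revenue([1.0, 2.0], [1.0, 0.5], position_effects)

# Ordering B: the same two items with their slots swapped.
rev_b = expected_revenue([2.0, 1.0], [0.5, 1.0], position_effects)

print(rev_a, rev_b)
```

Under these assumed numbers the two orderings of the same assortment give different expected revenues, which is the kind of gap an assortment-only policy cannot account for.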
Source URL: