Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
Type:
Article
Authors:
Chen, Yi; Wang, Yining; Fang, Ethan X.; Wang, Zhaoran; Li, Runze
Affiliations:
Hong Kong University of Science & Technology; University of Texas System; University of Texas Dallas; Duke University; Northwestern University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2022.2108816
Publication date:
2024
Pages:
246-258
Keywords:
estimating individualized treatment
randomized allocation
inference
design
Abstract:
We consider the stochastic contextual bandit problem under the high-dimensional linear model. We focus on the case where the action space is finite and random, with each action associated with a randomly generated contextual covariate. This setting has important applications in personalized recommendations, online advertisements, and personalized medicine. However, balancing the exploration-exploitation tradeoff in this setting is very challenging. We modify the LinUCB algorithm in doubly growing epochs and estimate the parameter using the best subset selection method, which is easy to implement in practice. This approach achieves Õ(s√T) regret with high probability, which is nearly independent of the ambient regression model dimension d. We further attain a sharper Õ(√(sT)) regret by using the SupLinUCB framework and match the minimax lower bound of the low-dimensional linear stochastic bandit problem. Finally, we conduct extensive numerical experiments to empirically demonstrate our algorithms' applicability and robustness. Supplementary materials for this article are available online.
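The two ingredients named in the abstract, best subset selection and a LinUCB-style action rule, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `best_subset_selection` and `ucb_action`, the exhaustive search over supports, the bonus weight `alpha`, and the use of a full d-by-d Gram matrix are all assumptions made for illustration. The brute-force search is only practical for small d and s, and the epoch schedule described in the abstract is not modeled here.

```python
import itertools

import numpy as np


def best_subset_selection(X, y, s):
    """Toy best subset selection: fit least squares on every support of
    size s and keep the fit with the smallest residual sum of squares.
    Exhaustive search is an illustrative assumption, feasible only for small d."""
    n, d = X.shape
    best_rss, best_theta = np.inf, np.zeros(d)
    for support in itertools.combinations(range(d), s):
        cols = list(support)
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = float(np.sum((y - X[:, cols] @ coef) ** 2))
        if rss < best_rss:
            best_rss = rss
            best_theta = np.zeros(d)
            best_theta[cols] = coef  # zero outside the selected support
    return best_theta


def ucb_action(contexts, theta_hat, gram, alpha):
    """LinUCB-style rule: choose the action maximizing the estimated reward
    plus alpha * sqrt(x^T gram^{-1} x). alpha is a hypothetical tuning knob."""
    gram_inv = np.linalg.inv(gram)
    # Quadratic form x_i^T gram_inv x_i for each row x_i of `contexts`.
    bonus = np.sqrt(np.einsum("ij,jk,ik->i", contexts, gram_inv, contexts))
    return int(np.argmax(contexts @ theta_hat + alpha * bonus))
```

In a bandit loop under these assumptions, one would refit `theta_hat` with `best_subset_selection` on the data collected so far at the start of each epoch, then call `ucb_action` on each round's contexts with a regularized Gram matrix such as `X.T @ X + np.eye(d)`.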
Source URL: