Multitask Learning and Bandits via Robust Statistics
Publication Type:
Article
Authors:
Xu, Kan; Bastani, Hamsa
Affiliations:
Arizona State University; Arizona State University-Tempe; University of Pennsylvania
Journal:
MANAGEMENT SCIENCE
ISSN/ISBN:
0025-1909
DOI:
10.1287/mnsc.2022.00490
Publication Date:
2025
Keywords:
multitask learning
transfer learning
robust statistics
LASSO
contextual bandits
Abstract:
Decision makers often simultaneously face many related but heterogeneous learning problems. For instance, a large retailer may wish to learn product demand at different stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to learn patient risk at different providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. Motivated by real data sets, we study a natural setting where the unknown parameter in each learning instance can be decomposed into a shared global parameter plus a sparse instance-specific term. We propose a novel two-stage multitask learning estimator that exploits this structure in a sample-efficient way, using a unique combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). Our estimator yields improved sample complexity bounds in the feature dimension d relative to commonly employed estimators; this improvement is exponential for data-poor instances, which benefit the most from multitask learning. We illustrate the utility of these results for online learning by embedding our multitask estimator within simultaneous contextual bandit algorithms. We specify a dynamic calibration of our estimator to appropriately balance the bias-variance trade-off over time, improving the resulting regret bounds in the context dimension d. Finally, we illustrate the value of our approach on synthetic and real data sets.
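
The sketch below illustrates, under assumed design choices, how the two-stage structure described in the abstract could look in code: Stage 1 pools preliminary per-instance estimates with a robust statistic (here a coordinate-wise median, an assumed choice) to learn the shared global parameter, and Stage 2 runs a LASSO on each instance's residuals to recover its sparse instance-specific term. The function name multitask_estimate, the ridge preliminary fits, and the parameter lasso_alpha are hypothetical illustrations, not the paper's implementation.

```python
# Minimal sketch of a two-stage multitask estimator of the kind described above.
# Assumptions: ridge for the preliminary per-instance fits, coordinate-wise median
# as the robust aggregate, scikit-learn's Lasso for the debiasing stage.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

def multitask_estimate(X_list, y_list, lasso_alpha=0.1):
    """X_list[i], y_list[i]: design matrix and responses for learning instance i."""
    # Stage 1 (robust statistics): preliminary per-instance estimates, then a
    # robust aggregate across instances; the sparse instance-specific terms act
    # like outliers, so the median targets the shared global parameter.
    prelim = np.array([Ridge(alpha=1.0).fit(X, y).coef_
                       for X, y in zip(X_list, y_list)])
    beta_global = np.median(prelim, axis=0)

    # Stage 2 (LASSO debiasing): regress each instance's residuals on its
    # features with an l1 penalty to recover the sparse instance-specific term.
    betas = []
    for X, y in zip(X_list, y_list):
        delta_i = Lasso(alpha=lasso_alpha).fit(X, y - X @ beta_global).coef_
        betas.append(beta_global + delta_i)
    return beta_global, np.array(betas)
```

In the contextual bandit setting mentioned in the abstract, the regularization level (lasso_alpha here) would be re-calibrated dynamically as data accrue to balance the bias-variance trade-off over time; the fixed value above is a placeholder, not the paper's specification.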