Pretraining and the lasso
Publication type:
Article; Early Access
Authors:
Craig, Erin; Pilanci, Mert; Le Menestrel, Thomas; Narasimhan, Balasubramanian; Rivas, Manuel A.; Gullaksen, Stein-Erik; Dehghannasiri, Roozbeh; Salzman, Julia; Taylor, Jonathan; Tibshirani, Robert
Affiliations:
Stanford University; Stanford University; Stanford University; Stanford University; University of Bergen; Haukeland University Hospital; University of Bergen; Stanford University
Journal:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISBN:
1369-7412
DOI:
10.1093/jrsssb/qkaf050
Publication date:
2025
Keywords:
Abstract:
Pre-training is a powerful paradigm in machine learning for passing information across models. For example, suppose one has a modest-sized dataset of images of cats and dogs and plans to fit a deep neural network to classify them. With pre-training, we start with a neural network trained on a large corpus of images spanning not just cats and dogs but hundreds of classes. We fix all network weights except the top layer(s) and fine-tune on our dataset. This often results in dramatically better performance than training solely on our dataset. Here, we ask: can pre-training help the lasso? We propose a framework in which the lasso is fit on a large dataset and then fine-tuned on a smaller dataset. The latter can be a subset of the original, or have a different but related outcome. This framework has a wide variety of applications, including stratified and multi-response models. In the stratified-model setting, lasso pre-training first estimates coefficients common to all groups, then estimates group-specific coefficients during fine-tuning. Under appropriate assumptions, support recovery of the common coefficients is superior to that of the usual lasso trained on individual groups separately. This separate identification of common and individual coefficients also aids scientific understanding.
Source URL:
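Illustrative sketch: the stratified-model procedure described in the abstract (a pooled lasso fit to estimate common coefficients, followed by group-specific fine-tuning) can be sketched in Python with scikit-learn. This is not the authors' implementation; the simulated data, the fixed regularization level alpha=0.05, and the choice to treat the pooled prediction as a plain offset during fine-tuning are all simplifying assumptions made here for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Simulated stratified data: two groups that share some coefficients,
# with a few extra coefficients active only in group 1 (assumed setup).
n, p = 200, 50
X = rng.normal(size=(n, p))
group = rng.integers(0, 2, size=n)
beta_common = np.zeros(p); beta_common[:5] = 1.0   # shared signal
beta_group1 = np.zeros(p); beta_group1[5:8] = 0.5  # group-1-only signal
y = X @ beta_common + (group == 1) * (X @ beta_group1) \
    + rng.normal(scale=0.5, size=n)

# Step 1 (pre-training): a lasso on the pooled data estimates the
# coefficients common to all groups.
pretrained = Lasso(alpha=0.05).fit(X, y)

# Step 2 (fine-tuning): for each group, fit a lasso to the part of the
# signal not explained by the pooled fit, keeping the pooled prediction
# as a fixed offset.
fine_tuned = {}
for g in (0, 1):
    mask = group == g
    offset = pretrained.predict(X[mask])
    fine_tuned[g] = Lasso(alpha=0.05).fit(X[mask], y[mask] - offset)

def predict(x_new, g):
    """Combine the common (pre-trained) and group-specific fits."""
    return pretrained.predict(x_new) + fine_tuned[g].predict(x_new)
```

The paper's actual method additionally controls how strongly the fine-tuned model is tied to the pre-trained one; the sketch above fixes that trade-off implicitly by using the pooled prediction as an unweighted offset.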