THE GENERALIZATION ERROR OF MAX-MARGIN LINEAR CLASSIFIERS: BENIGN OVERFITTING AND HIGH DIMENSIONAL ASYMPTOTICS IN THE OVERPARAMETRIZED REGIME

Publication Type:
Article
Authors:
Montanari, Andrea; Ruan, Feng; Sohn, Youngtak; Yan, Jun
Affiliations:
Stanford University; Stanford University; Northwestern University; Massachusetts Institute of Technology (MIT)
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/25-AOS2489
Publication Date:
2025
Pages:
822-853
Keywords:
phase transitions; robust regression; neural networks; classification; universality; geometry; space; risk
Abstract:
Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that map the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data. We consider a stylized setting in which the data $(y_i, x_i)$, $i \le n$, are i.i.d. with $x_i \sim \mathsf{N}(0, \Sigma)$ a $p$-dimensional Gaussian feature vector and $y_i \in \{+1, -1\}$ a label whose distribution depends on a linear combination of the covariates $\langle \theta_*, x_i \rangle$. Recent universality results can be used to show that the results derived in the Gaussian setting also apply when $x_i = \varphi(z_i)$ for standard Gaussian $z_i$ and a nonlinear featurization map $\varphi$. We consider the proportional asymptotics $n, p \to \infty$ with $p/n \to \psi$ and derive exact expressions for the limiting generalization error. We use this theory to derive two results of independent interest: (i) sufficient conditions on $(\Sigma, \theta_*)$ for benign overfitting that parallel previously derived conditions in the case of linear regression; and (ii) an asymptotically exact expression for the generalization error when max-margin classification is used in conjunction with feature vectors produced by random one-layer neural networks.
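To make the stylized setting concrete, the following is a minimal simulation sketch, not the authors' code. All specifics here are illustrative assumptions: the logistic label model with signal strength 3, the choice $\Sigma = I_p$, the dimensions (n, p, d) = (200, 600, 100), and the use of scikit-learn's LinearSVC with a large C as a stand-in for the exact max-margin classifier. It draws Gaussian features, fits the (approximate) max-margin linear classifier, and Monte-Carlo-estimates the generalization error; the final lines sketch the random one-layer ReLU feature variant $x_i = \varphi(z_i)$.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n, p = 200, 600                      # overparametrized regime: psi = p/n = 3

# Ground-truth direction theta_*; we take Sigma = I_p for simplicity,
# although the theory covers general covariances.
theta_star = rng.standard_normal(p)
theta_star /= np.linalg.norm(theta_star)

def sample(m):
    # x_i ~ N(0, I_p); illustrative logistic label model
    # P(y_i = +1 | x_i) = sigmoid(3 * <theta_*, x_i>)
    X = rng.standard_normal((m, p))
    prob = 1.0 / (1.0 + np.exp(-3.0 * (X @ theta_star)))
    y = np.where(rng.random(m) < prob, 1, -1)
    return X, y

X_train, y_train = sample(n)

# A hinge-loss SVM with very large C approximates the hard-margin
# (max-margin) classifier; with p >> n the data are separable w.h.p.
clf = LinearSVC(C=1e6, loss="hinge", fit_intercept=False, max_iter=200_000)
clf.fit(X_train, y_train)

X_test, y_test = sample(20_000)
print("train error:", np.mean(clf.predict(X_train) != y_train))  # typically 0.0
print("test error :", np.mean(clf.predict(X_test) != y_test))    # Monte Carlo estimate

# Random-features variant x_i = phi(z_i): z_i ~ N(0, I_d) passed through a
# one-layer ReLU map with random weights W (an assumed instantiation of phi);
# labels depend on a linear function of z_i.
d = 100
W = rng.standard_normal((p, d)) / np.sqrt(d)
beta_star = rng.standard_normal(d)
beta_star /= np.linalg.norm(beta_star)
Z_train = rng.standard_normal((n, d))
X_rf = np.maximum(Z_train @ W.T, 0.0)                # x_i = relu(W z_i)
prob_rf = 1.0 / (1.0 + np.exp(-3.0 * (Z_train @ beta_star)))
y_rf = np.where(rng.random(n) < prob_rf, 1, -1)
clf_rf = LinearSVC(C=1e6, loss="hinge", fit_intercept=False, max_iter=200_000)
clf_rf.fit(X_rf, y_rf)

In the overparametrized regime p/n = 3 the training data are linearly separable with high probability, so the fitted classifier interpolates (zero training error) while the printed test error approximates the limiting generalization error that the paper characterizes exactly.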