A DUALITY FRAMEWORK FOR ANALYZING RANDOM FEATURE AND TWO-LAYER NEURAL NETWORKS

成果类型:
Article
署名作者:
Chen, Hongrui; Long, Jihao; Wu, Lei
署名单位:
Stanford University; Peking University
刊物名称:
ANNALS OF STATISTICS
ISSN/ISSBN:
0090-5364
DOI:
10.1214/25-AOS2492
发表日期:
2025
页码:
1044-1067
关键词:
Bounds approximation rates complexity algorithm
摘要:
We consider the problem of learning functions within the Fp,pi and Barron spaces, which play crucial roles in understanding random feature models (RFMs), two-layer neural networks as well as kernel methods. Leveraging tools from information-based complexity (IBC), we establish a dual equivalence between approximation and estimation and then apply it to study the learning of the preceding function spaces. The duality allows us to focus on the more tractable problem between approximation and estimation. To showcase the efficacy of our duality framework, we delve into two important but under-explored problems: (1) Random feature learning beyond kernel regime. We derive sharp bounds for learning Fp,pi using RFMs. Notably, the learning is efficient without the curse of dimensionality for p>1. This underscores the extended applicability of RFMs beyond the traditional kernel regime, since Fp,pi with p<2 is strictly larger than the corresponding reproducing kernel Hilbert space (RKHS) where p=2. (2) The L infinity learning of RKHS. We establish sharp, spectrum-dependent characterizations for the convergence of L infinity learning error in both noiseless and noisy settings. Surprisingly, we show that popular kernel ridge regression can achieve near-optimal performance in L infinity learning, despite it primarily minimizing square loss. To establish the aforementioned duality, we introduce a type of IBC, termed I-complexity, to measure the size of a function class. Notably, I-complexity offers a tight characterization of learning in noiseless settings, yields lower bounds comparable to Le Cam's in noisy settings, and is versatile in deriving upper bounds. We believe that our duality framework holds potential for broad application in learning analysis across more scenarios.
来源URL: