EXACT LOWER BOUNDS FOR THE AGNOSTIC PROBABLY-APPROXIMATELY-CORRECT (PAC) MACHINE LEARNING MODEL

Document type:
Article
Authors:
Kontorovich, Aryeh; Pinelis, Iosif
Affiliations:
Ben-Gurion University of the Negev; Michigan Technological University
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/18-AOS1766
Publication date:
2019
Pages:
2822-2854
Keywords:
Abstract:
We provide an exact nonasymptotic lower bound on the minimax expected excess risk (EER) in the agnostic probably-approximately-correct (PAC) machine learning classification model and identify minimax learning algorithms as certain maximally symmetric and minimally randomized voting procedures. Based on this result, an exact asymptotic lower bound on the minimax EER is provided. This bound is of the simple form $c_\infty/\sqrt{\nu}$ as $\nu \to \infty$, where $c_\infty = 0.16997\ldots$ is a universal constant, $\nu = m/d$, $m$ is the size of the training sample, and $d$ is the Vapnik-Chervonenkis dimension of the hypothesis class. It is shown that the differences between these asymptotic and nonasymptotic bounds, as well as the differences between these two bounds and the maximum EER of any learning algorithms that minimize the empirical risk, are asymptotically negligible, and all these differences are due to ties in the mentioned voting procedures. A few easy-to-compute nonasymptotic lower bounds on the minimax EER are also obtained, which are shown to be close to the exact asymptotic lower bound $c_\infty/\sqrt{\nu}$ even for rather small values of the ratio $\nu = m/d$. As an application of these results, we substantially improve existing lower bounds on the tail probability of the excess risk. Among the tools used are Bayes estimation and apparently new identities and inequalities for binomial distributions.