
The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled Chi-square

Type:

Article

Authors:

Sur, Pragya; Chen, Yuxin; Candes, Emmanuel J.

Affiliations:

Stanford University; Princeton University; Stanford University

Journal:

PROBABILITY THEORY AND RELATED FIELDS

ISSN/ISBN:

0178-8051

DOI:

10.1007/s00440-018-00896-9

Publication date:

2019

Pages:

487-558

Keywords:

robust regression; Bartlett correction; phase transitions; M-estimators; parameters; behavior; statistics; models; number; p^2/n

Abstract:

Logistic regression is used thousands of times a day to fit data, predict future outcomes, and assess the statistical significance of explanatory variables. When used for the purpose of statistical inference, logistic models produce p-values for the regression coefficients by using an approximation to the distribution of the likelihood-ratio test (LRT). Indeed, Wilks' theorem asserts that whenever we have a fixed number $p$ of variables, twice the log-likelihood ratio (LLR) $2\Lambda$ is distributed as a $\chi^2_k$ variable in the limit of large sample sizes $n$; here, $\chi^2_k$ is a Chi-square with $k$ degrees of freedom and $k$ the number of variables being tested. In this paper, we prove that when $p$ is not negligible compared to $n$, Wilks' theorem does not hold and that the Chi-square approximation is grossly incorrect; in fact, this approximation produces p-values that are far too small (under the null hypothesis). Assume that $n$ and $p$ grow large in such a way that $p/n \to \kappa$ for some constant $\kappa < 1/2$. (For $\kappa > 1/2$, $2\Lambda \xrightarrow{\mathrm{P}} 0$ so that the LRT is not interesting in this regime.) We prove that for a class of logistic models, the LLR converges to a rescaled Chi-square, namely, $2\Lambda \xrightarrow{\mathrm{d}} \alpha(\kappa)\chi^2_k$, where the scaling factor $\alpha(\kappa)$ is greater than one as soon as the dimensionality ratio $\kappa$ is positive. Hence, the LLR is larger than classically assumed. For instance, when $\kappa = 0.3$, $\alpha(\kappa) \approx 1.5$. In general, we show how to compute the scaling factor by solving a nonlinear system of two equations with two unknowns. Our mathematical arguments are involved and use techniques from approximate message passing theory, from non-asymptotic random matrix theory and from convex geometry. We also complement our mathematical study by showing that the new limiting distribution is accurate for finite sample sizes. Finally, all the results from this paper extend to some other regression models such as the probit regression model.
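
To make the practical effect of the rescaling concrete, the following is a minimal sketch (not the authors' code) that compares the classical p-value, computed from a Chi-square with k degrees of freedom, with the p-value obtained after dividing twice the LLR by the scaling factor. It assumes the approximate value of 1.5 quoted in the abstract for a dimensionality ratio of 0.3; it does not solve the nonlinear system that defines the scaling factor. The function names and the observed statistic 3.84 are hypothetical and chosen only for illustration.

```python
import math


def chi2_sf(x, k):
    """Survival function P(chi2_k > x) for integer k >= 1 (standard library only)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if k % 2 == 0:
        # Even k = 2m: Q(m, y) = exp(-y) * sum_{j=0}^{m-1} y^j / j!
        total = sum(y ** j / math.factorial(j) for j in range(k // 2))
        return math.exp(-y) * total
    # Odd k = 2m + 1:
    # Q(m + 1/2, y) = erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{m} y^(j - 1/2) / Gamma(j + 1/2)
    m = k // 2
    total = sum(y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, m + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


def llr_p_values(two_lambda, k, alpha):
    """Return (classical, rescaled) p-values for an observed value of twice the LLR.

    classical: treats 2*Lambda as chi2_k (Wilks' approximation).
    rescaled:  treats 2*Lambda as alpha * chi2_k, i.e. refers 2*Lambda / alpha to chi2_k.
    """
    classical = chi2_sf(two_lambda, k)
    rescaled = chi2_sf(two_lambda / alpha, k)
    return classical, rescaled


# Hypothetical observed statistic; alpha = 1.5 is the approximate value the
# abstract gives for kappa = 0.3 (an assumption reused here, not recomputed).
classical, rescaled = llr_p_values(3.84, k=1, alpha=1.5)
print(f"classical p-value: {classical:.4f}")
print(f"rescaled p-value:  {rescaled:.4f}")
```

Because the scaling factor exceeds one, the rescaled p-value is always at least as large as the classical one, which matches the abstract's statement that the classical approximation produces p-values that are far too small under the null hypothesis.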