-
Authors: Wang, Shuaiwen; Weng, Haolei; Maleki, Arian
Affiliations: Columbia University; Michigan State University
Abstract: We study the problem of variable selection for linear models under the high-dimensional asymptotic setting, where the number of observations n grows at the same rate as the number of predictors p. We consider two-stage variable selection techniques (TVS) in which the first stage uses bridge estimators to obtain an estimate of the regression coefficients, and the second stage simply thresholds this estimate to select the important predictors. The asymptotic false discovery proportion (AFDP) and...
-
Authors: Rousseau, Judith; Szabo, Botond
Affiliations: University of Oxford; Leiden University - Excl LUMC; Leiden University
Abstract: We investigate the frequentist coverage properties of (certain) Bayesian credible sets in a general, adaptive, nonparametric framework. It is well known that the construction of adaptive and honest confidence sets is not possible in general. To overcome this problem (in the context of sieve-type priors), we introduce an extra assumption on the functional parameters, the so-called general polished tail condition. We then show that under standard assumptions, both the hierarchical and empirical B...
-
Authors: Westling, Ted; Carone, Marco
Affiliations: University of Pennsylvania; University of Washington; University of Washington Seattle
Abstract: The problem of nonparametric inference on a monotone function has been extensively studied in many particular cases. Estimators considered have often been of so-called Grenander type, being representable as the left derivative of the greatest convex minorant or least concave majorant of an estimator of a primitive function. In this paper, we provide general conditions for consistency and pointwise convergence in distribution of a class of generalized Grenander-type estimators of a monotone fun...
-
Authors: Candes, Emmanuel J.; Sur, Pragya
Affiliations: Stanford University; Harvard University
Abstract: This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp phase transition. We introduce an explicit boundary curve $h_{\mathrm{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes n and number of features p proportioned in such a way that $p/n \to \kappa$...
-
Authors: Ledoit, Olivier; Wolf, Michael
Affiliations: University of Zurich
Abstract: This paper establishes the first analytical formula for nonlinear shrinkage estimation of large-dimensional covariance matrices. We achieve this by identifying and mathematically exploiting a deep connection between nonlinear shrinkage and nonparametric estimation of the Hilbert transform of the sample spectral density. Previous nonlinear shrinkage methods were of numerical nature: QuEST requires numerical inversion of a complex equation from random matrix theory whereas NERCOME is based on a s...
-
Authors: Lopes, Miles E.; Lin, Zhenhua; Mueller, Hans-Georg
Affiliations: University of California System; University of California Davis
Abstract: In recent years, bootstrap methods have drawn attention for their ability to approximate the laws of max statistics in high-dimensional problems. A leading example of such a statistic is the coordinatewise maximum of a sample average of n random vectors in $\mathbb{R}^p$. Existing results for this statistic show that the bootstrap can work when $n \ll p$, and rates of approximation (in Kolmogorov distance) have been obtained with only logarithmic dependence in p. Nevertheless, one of the challenging aspects...
-
Authors: Bunea, Florentina; Giraud, Christophe; Luo, Xi; Royer, Martin; Verzelen, Nicolas
Affiliations: Cornell University; Centre National de la Recherche Scientifique (CNRS); Universite Paris Saclay; University of Texas System; University of Texas Health Science Center Houston; University of Texas School Public Health; Universite de Montpellier; Institut Agro; Montpellier SupAgro; INRAE
Abstract: The problem of variable clustering is that of estimating groups of similar components of a p-dimensional vector $X = (X_1, \ldots, X_p)$ from n independent copies of X. There exists a large number of algorithms that return data-dependent groups of variables, but their interpretation is limited to the algorithm that produced them. An alternative is model-based clustering, in which one begins by defining population level clusters relative to a model that embeds notions of similarity. Algorithms tai...
-
Authors: He, Yi; Hou, Yanxi; Peng, Liang; Shen, Haipeng
Affiliations: University of Amsterdam; Fudan University; University System of Georgia; Georgia State University; University of Hong Kong
Abstract: Conditional value-at-risk is a popular risk measure in risk management. We study the inference problem of conditional value-at-risk under a linear predictive regression model. We derive the asymptotic distribution of the least squares estimator for the conditional value-at-risk. Our results relax the model assumptions made in (Oper. Res. 60 (2012) 739-756) and correct their mistake in the asymptotic variance expression. We show that the asymptotic variance depends on the quantile density funct...
-
Authors: Shen, Yandi; Gao, Chao; Witten, Daniela; Han, Fang
Affiliations: University of Washington; University of Washington Seattle; University of Chicago
Abstract: Consider the heteroscedastic nonparametric regression model with random design $Y_i = f(X_i) + V^{1/2}(X_i)\varepsilon_i$, $i = 1, 2, \ldots, n$, with $f(\cdot)$ and $V(\cdot)$ $\alpha$- and $\beta$-Holder smooth, respectively. We show that the minimax rate of estimating $V(\cdot)$ under both local and global squared risks is of the order $n^{-8\alpha\beta/(4\alpha\beta+2\alpha+\beta)} \vee n^{-2\beta/(2\beta+1)}$, where $a \vee b := \max\{a, b\}$ for any two real numbers a, b. This result extends the fixed design rate n(...
-
Authors: Gao, Chao; van der Vaart, Aad W.; Zhou, Harrison H.
Affiliations: University of Chicago; Leiden University; Leiden University - Excl LUMC; Yale University
Abstract: High dimensional statistics deals with the challenge of extracting structured information from complex model settings. Compared with a large number of frequentist methodologies, there are rather few theoretically optimal Bayes methods for high dimensional models. This paper provides a unified approach to both Bayes high dimensional statistics and Bayes nonparametrics in a general framework of structured linear models. With a proposed two-step prior, we prove a general oracle inequality for pos...