-
Authors: Fan, Jianqing; Liao, Yuan
Affiliations: Princeton University; University System of Maryland; University of Maryland College Park
Abstract: Most papers on high-dimensional statistics are based on the assumption that none of the regressors are correlated with the regression error, namely, they are exogenous. Yet, endogeneity can arise incidentally from a large pool of regressors in a high-dimensional regression. This causes the inconsistency of the penalized least-squares method and possible false scientific discoveries. A necessary condition for model selection consistency of a general class of penalized regression methods is give...
-
Authors: Jacod, Jean; Todorov, Viktor
Affiliations: Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI); Sorbonne Universite; Northwestern University
Abstract: We propose new nonparametric estimators of the integrated volatility of an Itô semimartingale observed at discrete times on a fixed time interval with mesh of the observation grid shrinking to zero. The proposed estimators achieve the optimal rate and variance of estimating integrated volatility even in the presence of infinite variation jumps when the latter are stochastic integrals with respect to locally stable Lévy processes, that is, processes whose Lévy measure around zero behaves like t...
-
Authors: Belloni, Alexandre; Chernozhukov, Victor; Wang, Lie
Affiliations: Duke University; Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT)
Abstract: We propose a self-tuning square-root Lasso method that simultaneously resolves three important practical problems in high-dimensional regression analysis, namely it handles the unknown scale, heteroscedasticity and (drastic) non-Gaussianity of the noise. In addition, our analysis allows for badly behaved designs, for example, perfectly collinear regressors, and generates sharp bounds even in extreme cases, such as the infinite variance case and the noiseless case, in contrast to Lasso. We establish v...
-
Authors: Cholaquidis, Alejandro; Cuevas, Antonio; Fraiman, Ricardo
Affiliations: Universidad de la Republica, Uruguay; Autonomous University of Madrid
Abstract: A domain $S \subset \mathbb{R}^d$ is said to fulfill the Poincaré cone property if any point in the boundary of $S$ is the vertex of a (finite) cone which does not otherwise intersect the closure $\bar{S}$. For more than a century, this condition has played a relevant role in the theory of partial differential equations, as a shape assumption aimed to ensure the existence of a solution for the classical Dirichlet problem on $S$. In a completely different setting, this paper is devoted to analyzing some s...
-
Authors: Lepski, Oleg; Serdyukova, Nora
Affiliations: Aix-Marseille Universite; Universidad de Concepcion
Abstract: The problem of adaptive multivariate function estimation in the single-index regression model with random design and weak assumptions on the noise is investigated. A novel estimation procedure that adapts simultaneously to the unknown index vector and the smoothness of the link function by selecting from a family of specific kernel estimators is proposed. We establish a pointwise oracle inequality which, in turn, is used to judge the quality of estimating the entire function (global oracle...
-
Authors: Vu, Vincent Q.; Lei, Jing
Affiliations: University System of Ohio; Ohio State University; Carnegie Mellon University
Abstract: We study sparse principal components analysis in high dimensions, where p (the number of variables) can be much larger than n (the number of observations), and analyze the problem of estimating the subspace spanned by the principal eigenvectors of the population covariance matrix. We introduce two complementary notions of $\ell_q$ subspace sparsity: row sparsity and column sparsity. We prove nonasymptotic lower and upper bounds on the minimax subspace estimation error for $0 \le q \le 1$. The bounds are...
-
Authors: Dette, Holger; Melas, Viatcheslav B.; Shpilev, Petr
Affiliations: Ruhr University Bochum; Saint Petersburg State University
Abstract: This paper considers the problem of constructing optimal discriminating experimental designs for competing regression models on the basis of the T-optimality criterion introduced by Atkinson and Fedorov [Biometrika 62 (1975a) 57-70]. T-optimal designs depend on unknown model parameters and it is demonstrated that these designs are sensitive with respect to misspecification. As a solution to this problem we propose a Bayesian and standardized maximin approach to construct robust and efficient d...
-
Authors: Zhang, Li
Affiliations: Microsoft
Abstract: We present estimators for a well-studied statistical estimation problem: the estimation for the linear regression model with soft sparsity constraints ($\ell_q$ constraint with $0 < q \le 1$) in the high-dimensional setting. We first present a family of estimators, called the projected nearest neighbor estimator and show, by using results from convex geometry, that such an estimator is within a logarithmic factor of the optimal for any design matrix. Then by utilizing a semi-definite programming relaxation te...
-
Authors: Lee, Kuang-Yao; Li, Bing; Chiaromonte, Francesca
Affiliations: Yale University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: In this paper we introduce a general theory for nonlinear sufficient dimension reduction, and explore its ramifications and scope. This theory subsumes recent work employing reproducing kernel Hilbert spaces, and reveals many parallels between linear and nonlinear sufficient dimension reduction. Using these parallels we analyze the properties of existing methods and develop new ones. We begin by characterizing dimension reduction at the general level of sigma-fields and proceed to that of clas...
-
Authors: Woodard, Dawn B.; Rosenthal, Jeffrey S.
Affiliations: Cornell University; Cornell University; University of Toronto
Abstract: We analyze the convergence rate of a simplified version of a popular Gibbs sampling method used for statistical discovery of gene regulatory binding motifs in DNA sequences. This sampler satisfies a very strong form of ergodicity (uniform). However, we show that, due to multimodality of the posterior distribution, the rate of convergence often decreases exponentially as a function of the length of the DNA sequence. Specifically, we show that this occurs whenever there is more than one true rep...