-
Authors: Dobriban, Edgar
Affiliations: University of Pennsylvania
Abstract: Researchers often have datasets measuring features $x_{ij}$ of samples, such as test scores of students. In factor analysis and PCA, these features are thought to be influenced by unobserved factors, such as skills. Can we determine how many components affect the data? This is an important problem, because decisions made here have a large impact on all downstream data analysis. Consequently, many approaches have been developed. Parallel Analysis is a popular permutation method: it randomly scramble...
-
Authors: Ghorbani, Behrooz; Mei, Song; Misiakiewicz, Theodor; Montanari, Andrea
Affiliations: Stanford University; Stanford University; Stanford University
-
Authors: Schmidt-Hieber, Johannes
Affiliations: University of Twente
-
Authors: Reiss, Markus; Schmidt-Hieber, Johannes
Affiliations: Humboldt University of Berlin; University of Twente
Abstract: Given data from a Poisson point process with intensity $(x, y) \mapsto n\mathbf{1}(f(x) \le y)$, frequentist properties for the Bayesian reconstruction of the support boundary function $f$ are derived. We mainly study compound Poisson process priors with fixed intensity, proving that the posterior contracts with nearly optimal rate for monotone support boundaries and adapts to Hölder smooth boundaries. We then derive a limiting shape result for a compound Poisson process prior and a function space ...
-
Authors: Bachoc, Francois; Preinerstorfer, David; Steinberger, Lukas
Affiliations: Université de Toulouse; Université Toulouse III - Paul Sabatier; Université Libre de Bruxelles; University of Freiburg
Abstract: We suggest general methods to construct asymptotically uniformly valid confidence intervals post-model-selection. The constructions are based on principles recently proposed by Berk et al. (Ann. Statist. 41 (2013) 802-837). In particular, the candidate models used can be misspecified, the target of inference is model-specific, and coverage is guaranteed for any data-driven model selection procedure. After developing a general theory, we apply our methods to practically important situations whe...
-
Authors: Drton, Mathias; Han, Fang; Shi, Hongjian
Affiliations: Technical University of Munich; University of Washington; University of Washington Seattle
Abstract: Testing mutual independence for high-dimensional observations is a fundamental statistical challenge. Popular tests based on linear and simple rank correlations are known to be incapable of detecting nonlinear, nonmonotone relationships, calling for methods that can account for such dependences. To address this challenge, we propose a family of tests that are constructed using maxima of pairwise rank correlations that permit consistent assessment of pairwise independence. Built upon a newly de...
-
Authors: Mao, Cheng; Pananjady, Ashwin; Wainwright, Martin J.
Affiliations: University System of Georgia; Georgia Institute of Technology; University of California System; University of California Berkeley; University of California System; University of California Berkeley
Abstract: Many applications, including rank aggregation, crowd-labeling and graphon estimation, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and/or columns. We consider the problem of estimating an unknown matrix in this class, based on noisy observations of (possibly, a subset of) its entries. We design and analyze polynomial-time algorithms that improve upon the state of the art in two distinct metrics, showing, in particular, that minimax optimal...
-
Authors: Mendelson, Shahar; Zhivotovskiy, Nikita
Affiliations: Australian National University; HSE University (National Research University Higher School of Economics)
Abstract: Let $X$ be a centered random vector taking values in $\mathbb{R}^d$ and let $\Sigma = \mathbb{E}(X \otimes X)$ be its covariance matrix. We show that if $X$ satisfies an $L_4$-$L_2$ norm equivalence (sometimes referred to as the bounded kurtosis assumption), there is a covariance estimator $\hat{\Sigma}$ that exhibits almost the same performance one would expect had $X$ been a Gaussian vector. The procedure also improves the current state-of-the-art regarding high probability bounds in the sub-Gaussian case (sharp r...
-
Authors: Zhu, Guangyu; Su, Zhihua
Affiliations: University of Rhode Island; State University System of Florida; University of Florida
Abstract: Sparse partial least squares (SPLS) is widely used in applied sciences as a method that performs dimension reduction and variable selection simultaneously in linear regression. Several implementations of SPLS have been derived, among which the SPLS proposed in Chun and Keles (J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 (2010) 3-25) is very popular and highly cited. However, for all of these implementations, the theoretical properties of SPLS are largely unknown. In this paper, we propose a new...
-
Authors: Ing, Ching-Kang
Affiliations: National Tsing Hua University
Abstract: We investigate the prediction capability of the orthogonal greedy algorithm (OGA) in high-dimensional regression models with dependent observations. The rates of convergence of the prediction error of OGA are obtained under a variety of sparsity conditions. To prevent OGA from overfitting, we introduce a high-dimensional Akaike's information criterion (HDAIC) to determine the number of OGA iterations. A key contribution of this work is to show that OGA, used in conjunction with HDAIC, can achi...