-
Authors: Birnbaum, Aharon; Johnstone, Iain M.; Nadler, Boaz; Paul, Debashis
Affiliations: Hebrew University of Jerusalem; Stanford University; Weizmann Institute of Science; University of California System; University of California Davis
Abstract: We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish a lower bound on the minimax risk of estimators under the ℓ_2 loss, in the joint limit as dimension and sample size increase to infinity, under various models of sparsity for the population eigenvectors. The lower bound on the risk points to the existence of different regimes of sparsity of the eigenvectors. We also propose a n...
-
Authors: Leeb, Hannes
Affiliations: University of Vienna
Abstract: We study the conditional distribution of low-dimensional projections from high-dimensional data, where the conditioning is on other low-dimensional projections. To fix ideas, consider a random d-vector Z that has a Lebesgue density and that is standardized so that EZ = 0 and EZZ' = I_d. Moreover, consider two projections defined by unit-vectors alpha and beta, namely a response y = alpha'Z and an explanatory variable x = beta'Z. It has long been known that the conditional mean of y given x is ...
-
Authors: Nguyen, Xuanlong
Affiliations: University of Michigan System; University of Michigan
Abstract: This paper studies convergence behavior of latent mixing measures that arise in finite and infinite mixture models, using transportation distances (i.e., Wasserstein metrics). The relationship between Wasserstein distances on the space of mixing measures and f-divergence functionals such as Hellinger and Kullback-Leibler distances on the space of mixture distributions is investigated in detail using various identifiability conditions. Convergence in Wasserstein metrics for discrete measures im...
-
Authors: Cai, T. Tony; Ma, Zongming; Wu, Yihong
Affiliations: University of Pennsylvania; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation p...
-
Authors: Amini, Arash A.; Chen, Aiyou; Bickel, Peter J.; Levina, Elizaveta
Affiliations: University of Michigan System; University of Michigan; Alphabet Inc.; Google Incorporated; University of California System; University of California Berkeley
Abstract: Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the e...
-
Authors: Chernozhukov, Victor; Chetverikov, Denis; Kato, Kengo
Affiliations: Massachusetts Institute of Technology (MIT); University of California System; University of California Los Angeles; University of Tokyo
Abstract: We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of the Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension of random vectors (p) is large compared to the sample size (n); in fact, p can be much larger than n, without restricting correlations o...
-
Authors: Jiang, Tiefeng; Yang, Fan
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: For random samples of size n obtained from p-variate normal distributions, we consider the classical likelihood ratio tests (LRT) for their means and covariance matrices in the high-dimensional setting. These test statistics have been extensively studied in multivariate analysis, and their limiting distributions under the null hypothesis were proved to be chi-square distributions as n goes to infinity and p remains fixed. In this paper, we consider the high-dimensional case where both p and n ...
-
Authors: Zhang, Rongmao; Peng, Liang; Wang, Ruodu
Affiliations: Zhejiang University; University System of Georgia; Georgia Institute of Technology; University of Waterloo
Abstract: Testing covariance structure is of importance in many areas of statistical analysis, such as microarray analysis and signal processing. Conventional tests for finite-dimensional covariance cannot be applied to high-dimensional data in general, and tests for high-dimensional covariance in the literature usually depend on some special structure of the matrix. In this paper, we propose some empirical likelihood ratio tests for testing whether a covariance matrix equals a given one or has a banded...
-
Authors: Chatterjee, A.; Lahiri, S. N.
Affiliations: Indian Statistical Institute; Indian Statistical Institute Delhi; North Carolina State University
Abstract: Zou [J. Amer. Statist. Assoc. 101 (2006) 1418-1429] proposed the Adaptive LASSO (ALASSO) method for simultaneous variable selection and estimation of the regression parameters, and established its oracle property. In this paper, we investigate the rate of convergence of the ALASSO estimator to the oracle distribution when the dimension of the regression parameters may grow to infinity with the sample size. It is shown that the rate critically depends on the choices of the penalty parameter and...
-
Authors: Li, Chenxu
Affiliations: Peking University
Abstract: This paper proposes a widely applicable method of approximate maximum-likelihood estimation for multivariate diffusion process from discretely sampled data. A closed-form asymptotic expansion for transition density is proposed and accompanied by an algorithm containing only basic and explicit calculations for delivering any arbitrary order of the expansion. The likelihood function is thus approximated explicitly and employed in statistical estimation. The performance of our method is demonstra...