-
Authors: Mei, Tianxing; Wang, Chen; Yao, Jianfeng
Affiliations: University of Hong Kong; The Chinese University of Hong Kong, Shenzhen
Abstract: We analyze the singular values of a large $p \times n$ data matrix $X_n = (x_{n1}, \ldots, x_{nn})$, where the columns $\{x_{nj}\}$ are independent $p$-dimensional vectors, possibly with different distributions. Assuming that the covariance matrices $\Sigma_{nj} = \mathrm{Cov}(x_{nj})$ of the column vectors can be asymptotically simultaneously diagonalized, with appropriately converging spectra, we establish a limiting spectral distribution (LSD) for the singular values of $X_n$ when both dimensions $p$ and $n$ grow to infinity in comparable m...
-
Authors: Butucea, Cristina; Mammen, Enno; Ndaoud, Mohamed; Tsybakov, Alexandre B.
Affiliations: Institut Polytechnique de Paris; ENSAE Paris; Ruprecht Karls University Heidelberg; ESSEC Business School
Abstract: In the pivotal variable selection problem, we derive the exact nonasymptotic minimax selector over the class of all $s$-sparse vectors, which is also the Bayes selector with respect to the uniform prior. While this optimal selector is, in general, not realizable in polynomial time, we show that its tractable counterpart (the scan selector) attains the minimax expected Hamming risk to within a factor of 2, and is also exact minimax with respect to the probability of wrong recovery. As a consequence, w...
-
Authors: Basu, Sumanta; Rao, Suhasini Subba
Affiliations: Cornell University; Texas A&M University System; Texas A&M University College Station
Abstract: We propose NonStGM, a general nonparametric graphical modeling framework for studying dynamic associations among the components of a nonstationary multivariate time series. It builds on the framework of Gaussian graphical models (GGM) and stationary time series graphical models (StGM) and complements existing works on parametric graphical models based on change point vector autoregressions (VAR). Analogous to StGM, the proposed framework captures conditional noncorrelations (both intertempora...
-
Authors: Ding, Jian; Du, Hang
Affiliations: Peking University
Abstract: For two correlated graphs which are independently sub-sampled from a common Erdős-Rényi graph $G(n, p)$, we wish to recover their latent vertex matching from the observation of these two graphs without labels. When $p = n^{-\alpha + o(1)}$ for $\alpha \in (0, 1]$, we establish a sharp information-theoretic threshold for whether it is possible to correctly match a positive fraction of vertices. Our result sharpens a constant factor in a recent work by Wu, Xu and Yu.
-
Authors: Bing, Xin; Wegkamp, Marten
Affiliations: University of Toronto; Cornell University
Abstract: In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower-dimensional space, and then base the classification on the resulting lower-dimensional projections. In this paper, we formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure and to guide which projection to choose. We propose a computationally efficient classifier that takes certain principal components (PCs) of the ...
-
Authors: Hoffmann, Marc; Trabs, Mathias
Affiliations: Universite PSL; Universite Paris-Dauphine; Helmholtz Association; Karlsruhe Institute of Technology
Abstract: We consider a space-structured population model generated by two point clouds: a homogeneous Poisson process $M$ with intensity $n \to \infty$ as a model for a parent generation, together with a Cox point process $N$ as offspring generation, with conditional intensity given by the convolution of $M$ with a scaled dispersal density $\sigma^{-1} f(\cdot/\sigma)$. Based on a realisation of $M$ and $N$, we study the nonparametric estimation of $f$ and the estimation of the physical scale parameter $\sigma > 0$ si...
-
Authors: Chandrasekher, Kabir Aladin; Pananjady, Ashwin; Thrampoulidis, Christos
Affiliations: Stanford University; University System of Georgia; Georgia Institute of Technology; University of British Columbia
Abstract: We consider a general class of regression models with normally distributed covariates, and the associated nonconvex problem of fitting these models from data. We develop a general recipe for analyzing the convergence of iterative algorithms for this task from a random initialization. In particular, provided each iteration can be written as the solution to a convex optimization problem satisfying some natural conditions, we leverage Gaussian comparison theorems to derive a deterministic seque...
-
Authors: Butucea, Cristina; Rohde, Angelika; Steinberger, Lukas
Affiliations: Institut Polytechnique de Paris; ENSAE Paris; University of Freiburg; University of Vienna
Abstract: Local differential privacy has recently received increasing attention from the statistics community as a valuable tool to protect the privacy of individual data owners without the need for a trusted third party. Similar to the classical notion of randomized response, the idea is that data owners randomize their true information locally and only release the perturbed data. Many different protocols for such local perturbation procedures can be designed. In most estimation problems studied in the ...
-
Authors: Einmahl, John H. J.; He, Yi
Affiliations: Tilburg University; University of Amsterdam
Abstract: We extend extreme value statistics to independent data with possibly very different distributions. In particular, we present novel asymptotic normality results for the Hill estimator, which now estimates the extreme value index of the average distribution. Due to the heterogeneity, the asymptotic variance can be substantially smaller than that in the i.i.d. case. As a special case, we consider a heterogeneous scales model where the asymptotic variance can be calculated explicitly. The primary ...
-
Authors: Han, Qiyang; Shen, Yandi
Affiliations: Rutgers University System; Rutgers University New Brunswick; University of Chicago
Abstract: The Convex Gaussian Min-Max Theorem (CGMT) has emerged as a prominent theoretical tool for analyzing the precise stochastic behavior of various statistical estimators in the so-called high-dimensional proportional regime, where the sample size and the signal dimension are of the same order. However, a well-recognized limitation of the existing CGMT machinery lies in its stringent requirement of exact Gaussianity of the design matrix, thereby rendering the obtained precise high-dimension...