-
Authors: Cape, Joshua; Tang, Minh; Priebe, Carey E.
Affiliations: Johns Hopkins University
Abstract: The singular value matrix decomposition plays a ubiquitous role throughout statistics and related fields. Myriad applications including clustering, classification, and dimensionality reduction involve studying and exploiting the geometric structure of singular values and singular vectors. This paper provides a novel collection of technical and theoretical tools for studying the geometry of singular subspaces using the two-to-infinity norm. Motivated by preliminary deterministic Procrustes anal...
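For readers unfamiliar with the norm named in this abstract: the two-to-infinity norm of a matrix is commonly defined as its largest row-wise Euclidean norm. The following is a minimal sketch under that standard definition; the function name `two_to_infinity_norm` is an illustrative choice and does not come from the paper.

```python
import math


def two_to_infinity_norm(matrix):
    """Return the largest Euclidean norm among the rows of `matrix`.

    `matrix` is a non-empty list of rows, each row a list of numbers.
    The name is hypothetical, chosen for this illustration only.
    """
    return max(math.sqrt(sum(x * x for x in row)) for row in matrix)


# Rows have Euclidean norms 5 and 1, so the norm is 5.
example = [[3.0, 4.0], [1.0, 0.0]]
value = two_to_infinity_norm(example)
```

Because it is a maximum over rows, this norm controls row-wise (entrywise-style) perturbations more finely than the spectral norm does, which is the reason it is useful for singular subspace geometry.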
-
Authors: Tan, Zhiqiang; Zhang, Cun-Hui
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_2$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst i...
-
Authors: Berrett, Thomas B.; Samworth, Richard J.; Yuan, Ming
Affiliations: University of Cambridge; University of Wisconsin System; University of Wisconsin Madison
Abstract: Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. To this end, we study weighted averages of the estimators originally proposed by Kozachenko and Leonenko [Probl. Inform. Transm. 23 (1987), 95-101], based on the k-nearest...
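As background for this abstract, here is a minimal sketch of the basic (unweighted) Kozachenko-Leonenko k-nearest-neighbour entropy estimator, in the commonly stated form $\hat H_n = \frac{1}{n}\sum_{i=1}^n \log\bigl(\rho_{(k),i}^d V_d (n-1)\bigr) - \Psi(k)$, where $\rho_{(k),i}$ is the distance from point $i$ to its $k$-th nearest neighbour, $V_d$ is the volume of the unit ball in $\mathbb{R}^d$, and $\Psi$ is the digamma function. The weighted averages studied in the paper are not implemented here, and the function names are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer k: -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def kl_entropy(points, k=1):
    """Unweighted Kozachenko-Leonenko entropy estimate (in nats).

    `points` is a list of n points, each a list of d floats.
    Assumes all points are distinct; a zero k-NN distance would make
    math.log raise ValueError. Brute-force O(n^2 log n) search.
    """
    n, d = len(points), len(points[0])
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(1 + d / 2)
    total = 0.0
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        rho = dists[k - 1]  # distance to the k-th nearest neighbour
        total += math.log(rho ** d * unit_ball_volume * (n - 1)) - digamma_int(k)
    return total / n
```

A quick sanity property: rescaling every point by a factor $c > 0$ shifts the estimate by $d \log c$, matching how differential entropy behaves under scaling.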
-
Authors: Barber, Rina Foygel; Candes, Emmanuel J.
Affiliations: University of Chicago; Stanford University
Abstract: This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In this framework, the observations are split into two groups, where the first group is used to screen for a set of potentially relevant variables, whereas the second is used for inference over this reduced set of variables; we also develop strategies for leveraging information from the first part of th...
-
Authors: Han, Qiyang; Wang, Tengyao; Chatterjee, Sabyasachi; Samworth, Richard J.
Affiliations: University of Washington; University of Washington Seattle; University of Cambridge; University of Chicago; University of Illinois System; University of Illinois Urbana-Champaign; University of Cambridge
Abstract: We study the least squares regression function estimator over the class of real-valued functions on $[0, 1]^d$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),\,1/d\}}$ in the empirical $L_2$ loss, up to polylogarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise consta...
-
Authors: Song, Yanglei; Fellouris, Georgios
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: The sequential multiple testing problem is considered under two generalized error metrics. Under the first one, the probability of at least $k$ mistakes, of any kind, is controlled. Under the second, the probabilities of at least $k_1$ false positives and at least $k_2$ false negatives are simultaneously controlled. For each formulation, the optimal expected sample size is characterized, to a first-order asymptotic approximation as the error probabilities go to 0, and a novel multiple testing proc...
-
Authors: Wu, Yihong; Yang, Pengkun
Affiliations: Yale University; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: We consider the problem of estimating the support size of a discrete distribution whose minimum nonzero mass is at least $1/k$. Under the independent sampling model, we show that the sample complexity, that is, the minimal sample size to achieve an additive error of $\epsilon k$ with probability at least 0.1, is within universal constant factors of $\frac{k}{\log k}\log^2\frac{1}{\epsilon}$, which improves the state-of-the-art result of $\frac{k}{\epsilon^2 \log k}$ in [In Advances in Neural Information Processing Systems (2013...
-
Authors: Bao, Zhigang
Affiliations: Hong Kong University of Science & Technology
Abstract: In this paper, we study a high-dimensional random matrix model from nonparametric statistics called the Kendall rank correlation matrix, which is a natural multivariate extension of the Kendall rank correlation coefficient. We establish the Tracy-Widom law for its largest eigenvalue. It is the first Tracy-Widom law for a nonparametric random matrix model, and also the first Tracy-Widom law for a high-dimensional U-statistic.
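To make the object in this abstract concrete, the sketch below builds a Kendall rank correlation matrix from $n$ observations of a $p$-dimensional vector and approximates its largest eigenvalue by power iteration. It assumes the usual U-statistic form $K = \binom{n}{2}^{-1} \sum_{i<j} \operatorname{sign}(x_i - x_j)\operatorname{sign}(x_i - x_j)^\top$; the paper's exact normalisation, and the centering and scaling needed for the Tracy-Widom limit, may differ and are not reproduced. All function names are illustrative.

```python
import math
import random
from itertools import combinations


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_matrix(rows):
    """Kendall rank correlation matrix of n observations (rows) in R^p."""
    n, p = len(rows), len(rows[0])
    n_pairs = n * (n - 1) / 2
    K = [[0.0] * p for _ in range(p)]
    for a, b in combinations(rows, 2):
        s = [sign(x - y) for x, y in zip(a, b)]  # componentwise sign vector
        for i in range(p):
            for j in range(p):
                K[i][j] += s[i] * s[j] / n_pairs
    return K


def largest_eigenvalue(K, iters=500, seed=0):
    """Power iteration; valid here because K is positive semidefinite."""
    p = len(K)
    rng = random.Random(seed)  # random start avoids orthogonal initial vectors
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    lam = 0.0
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = math.sqrt(sum(x * x for x in w))
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam


# Two perfectly concordant coordinates give K = [[1, 1], [1, 1]].
K = kendall_matrix([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
top = largest_eigenvalue(K)
```

Since $K$ is an average of outer products, it is positive semidefinite, so the norm produced by power iteration converges to its largest eigenvalue.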
-
Authors: Wei, Yuting; Wainwright, Martin J.; Guntuboyina, Adityanand
Affiliations: University of California System; University of California Berkeley
Abstract: We consider a compound testing problem within the Gaussian sequence model in which the null and alternative are specified by a pair of closed, convex cones. Such cone testing problems arise in various applications, including detection of treatment effects, trend detection in econometrics, signal detection in radar processing and shape-constrained inference in nonparametric statistics. We provide a sharp characterization of the GLRT testing radius up to a universal multiplicative constant in te...
-
Authors: Boettcher, Bjoern; Keller-Ressel, Martin; Schilling, Rene L.
Affiliations: Technische Universität Dresden
Abstract: We introduce two new measures for the dependence of $n \geq 2$ random variables: distance multivariance and total distance multivariance. Both measures are based on the weighted $L_2$-distance of quantities related to the characteristic functions of the underlying random variables. These extend distance covariance (introduced by Szekely, Rizzo and Bakirov) from pairs of random variables to n-tuplets of random variables. We show that total distance multivariance can be used to detect the independence...