-
Authors: Vandermeulen, Robert A.; Scott, Clayton D.
Affiliations: University of Kaiserslautern; University of Michigan System; University of Michigan
Abstract: When estimating finite mixture models, it is common to make assumptions on the mixture components, such as parametric assumptions. In this work, we make no distributional assumptions on the mixture components and instead assume that observations from the mixture model are grouped, such that observations in the same group are known to be drawn from the same mixture component. We precisely characterize the number of observations n per group needed for the mixture model to be identifiable, as a f...
-
Authors: Bodnar, Taras; Dette, Holger; Parolya, Nestor
Affiliations: Stockholm University; Ruhr University Bochum; Leibniz University Hannover
Abstract: In this paper, new tests for the independence of two high-dimensional vectors are investigated. We consider the case where the dimension of the vectors increases with the sample size and propose multivariate analysis of variance-type statistics for the hypothesis of a block diagonal covariance matrix. The asymptotic properties of the new test statistics are investigated under the null hypothesis and the alternative hypothesis using random matrix theory. For this purpose, we study the weak conv...
-
Authors: Chetelat, Didier; Wells, Martin T.
Affiliations: Universite de Montreal; Polytechnique Montreal; Cornell University
Abstract: We study the behavior of a real p-dimensional Wishart random matrix with n degrees of freedom when $n, p \to \infty$ but $p/n \to 0$. We establish the existence of phase transitions when p grows at the order $n^{(K+1)/(K+3)}$ for every $K \in \mathbb{N}$, and derive expressions for approximating densities between every two phase transitions. To do this, we make use of a novel tool we call the F-conjugate of an absolutely continuous distribution, which is obtained from the Fourier transform of the s...
-
Authors: Ramdas, Aaditya K.; Barber, Rina F.; Wainwright, Martin J.; Jordan, Michael I.
Affiliations: Carnegie Mellon University; University of Chicago; University of California System; University of California Berkeley
Abstract: There is a significant literature on methods for incorporating knowledge into multiple testing procedures so as to improve their power and precision. Some common forms of prior knowledge include (a) beliefs about which hypotheses are null, modeled by nonuniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) multiple arbitrary partitions of the hypotheses into (possibly overlapping) groups; and (d) knowledge of independence, posi...
-
Authors: Cape, Joshua; Minh Tang; Priebe, Carey E.
Affiliations: Johns Hopkins University
Abstract: The singular value matrix decomposition plays a ubiquitous role throughout statistics and related fields. Myriad applications including clustering, classification, and dimensionality reduction involve studying and exploiting the geometric structure of singular values and singular vectors. This paper provides a novel collection of technical and theoretical tools for studying the geometry of singular subspaces using the two-to-infinity norm. Motivated by preliminary deterministic Procrustes anal...
-
Authors: Tan, Zhiqiang; Zhang, Cun-Hui
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_2$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst i...
-
Authors: Barber, Rina Foygel; Candes, Emmanuel J.
Affiliations: University of Chicago; Stanford University
Abstract: This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In this framework, the observations are split into two groups, where the first group is used to screen for a set of potentially relevant variables, whereas the second is used for inference over this reduced set of variables; we also develop strategies for leveraging information from the first part of th...
-
Authors: Han, Qiyang; Wang, Tengyao; Chatterjee, Sabyasachi; Samworth, Richard J.
Affiliations: University of Washington; University of Washington Seattle; University of Cambridge; University of Chicago; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: We study the least squares regression function estimator over the class of real-valued functions on $[0, 1]^d$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),\,1/d\}}$ in the empirical $L_2$ loss, up to polylogarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise consta...
-
Authors: Boettcher, Bjoern; Keller-Ressel, Martin; Schilling, Rene L.
Affiliations: Technische Universitat Dresden
Abstract: We introduce two new measures for the dependence of $n \geq 2$ random variables: distance multivariance and total distance multivariance. Both measures are based on the weighted $L_2$-distance of quantities related to the characteristic functions of the underlying random variables. These extend distance covariance (introduced by Szekely, Rizzo and Bakirov) from pairs of random variables to n-tuplets of random variables. We show that total distance multivariance can be used to detect the independence...