-
Authors: Zhang, Anru; Brown, Lawrence D.; Cai, T. Tony
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of Pennsylvania
Abstract: We propose a general semi-supervised inference framework focused on the estimation of the population mean. As usual in semi-supervised settings, there exists an unlabeled sample of covariate vectors and a labeled sample consisting of covariate vectors along with real-valued responses (labels). Otherwise, the formulation is assumption-lean in that no major conditions are imposed on the statistical or functional form of the data. We consider both the ideal semi-supervised setting where infinitel...
-
Authors: Bao, Zhigang; Hu, Jiang; Pan, Guangming; Zhou, Wang
Affiliations: Hong Kong University of Science & Technology; Northeast Normal University - China; Nanyang Technological University; National University of Singapore
Abstract: Consider a Gaussian vector z = (x', y')', consisting of two sub-vectors x and y with dimensions p and q, respectively. With n independent observations of z, we study the correlation between x and y, from the perspective of the canonical correlation analysis. We investigate the high-dimensional case: both p and q are proportional to the sample size n. Denote by Sigma(uv) the population cross-covariance matrix of random vectors u and v, and denote by Suv the sample counterpart. The canonical cor...
-
Authors: Eltzner, Benjamin; Huckemann, Stephan F.
Affiliations: University of Gottingen
Abstract: The central limit theorems (CLTs) for generalized Frechet means (data descriptors assuming values in manifolds, such as intrinsic means, geodesics, etc.) on manifolds from the literature are only valid if a certain empirical process of Hessians of the Frechet function converges suitably, as in the proof of the prototypical BP-CLT [Ann. Statist. 33 (2005) 1225-1259]. This is not valid in many realistic scenarios, and we provide a new, very general CLT. In particular, this includes scenarios wh...
-
Authors: Heckel, Reinhard; Shah, Nihar B.; Ramchandran, Kannan; Wainwright, Martin J.
Affiliations: Rice University; Carnegie Mellon University; Carnegie Mellon University; University of California System; University of California Berkeley; University of California System; University of California Berkeley
Abstract: We consider sequential or active ranking of a set of n items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of prespecified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-k items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comp...
-
Authors: Zheng, Shurong; Cheng, Guanghui; Guo, Jianhua; Zhu, Hongtu
Affiliations: Northeast Normal University - China; Northeast Normal University - China; Guangzhou University; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Testing correlation structures has attracted extensive attention in the literature due to both its importance in real applications and several major theoretical challenges. The aim of this paper is to develop a general framework for testing correlation structures in the one-, two- and multiple-sample testing problems under a high-dimensional setting when both the sample size and data dimension go to infinity. Our test statistics are designed to deal with both the dense and sparse alternatives. ...
-
Authors: Sun, Fasheng; Wang, Yaping; Xu, Hongquan
Affiliations: Northeast Normal University - China; Northeast Normal University - China; East China Normal University; University of California System; University of California Los Angeles
Abstract: Efficient designs are in high demand in practice for both computer and physical experiments. Existing designs (such as maximin distance designs and uniform designs) may have bad low-dimensional projections, which is undesirable when only a few factors are active. We propose a new design criterion, called uniform projection criterion, by focusing on projection uniformity. Uniform projection designs generated under the new criterion scatter points uniformly in all dimensions and have good space-...
-
Authors: Lee, Kyoungjae; Lee, Jaeyong; Lin, Lizhen
Affiliations: University of Notre Dame; University of Notre Dame; Seoul National University (SNU); Inha University
Abstract: In this paper we study the high-dimensional sparse directed acyclic graph (DAG) models under the empirical sparse Cholesky prior. Among our results, strong model selection consistency or graph selection consistency is obtained under more general conditions than those in the existing literature. Compared to Cao, Khare and Ghosh [Ann. Statist. (2019) 47 319-348], the required conditions are weakened in terms of the dimensionality, sparsity and lower bound of the nonzero elements in the Cholesky ...
-
Authors: Vandermeulen, Robert A.; Scott, Clayton D.
Affiliations: University of Kaiserslautern; University of Michigan System; University of Michigan
Abstract: When estimating finite mixture models, it is common to make assumptions on the mixture components, such as parametric assumptions. In this work, we make no distributional assumptions on the mixture components and instead assume that observations from the mixture model are grouped, such that observations in the same group are known to be drawn from the same mixture component. We precisely characterize the number of observations n per group needed for the mixture model to be identifiable, as a f...
-
Authors: Das, Debraj; Gregory, Karl; Lahiri, S. N.
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of South Carolina System; University of South Carolina Columbia; North Carolina State University
Abstract: The Adaptive Lasso (Alasso) was proposed by Zou [J. Amer. Statist. Assoc. 101 (2006) 1418-1429] as a modification of the Lasso for the purpose of simultaneous variable selection and estimation of the parameters in a linear regression model. Zou [J. Amer. Statist. Assoc. 101 (2006) 1418-1429] established that the Alasso estimator is variable-selection consistent as well as asymptotically Normal in the indices corresponding to the nonzero regression coefficients in certain fixed-dimensional sett...
-
Authors: Lugosi, Gabor; Mendelson, Shahar
Affiliations: ICREA; Pompeu Fabra University; Barcelona School of Economics; Technion Israel Institute of Technology; Australian National University
Abstract: We study the problem of estimating the mean of a random vector X given a sample of N independent, identically distributed points. We introduce a new estimator that achieves a purely sub-Gaussian performance under the only condition that the second moment of X exists. The estimator is based on a novel concept of a multivariate median.