-
Authors: Delattre, Sylvain; Roquain, Etienne
Affiliations: Sorbonne Universite; Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI)
Abstract: The false discovery proportion (FDP) is a convenient way to account for false positives when a large number m of tests are performed simultaneously. Romano and Wolf [Ann. Statist. 35 (2007) 1378-1408] have proposed a general principle that builds FDP controlling procedures from k-family-wise error rate controlling procedures while incorporating dependencies in an appropriate manner; see Korn et al. [J. Statist. Plann. Inference 124 (2004) 379-398]; Romano and Wolf (2007). However, the theoreti...
-
Authors: Lepski, Oleg
Affiliations: Aix-Marseille Universite
Abstract: We address the problem of adaptive minimax estimation in white Gaussian noise models under L_p-loss, 1 <= p <= infinity, on the anisotropic Nikol'skii classes. We present an estimation procedure based on a new data-driven selection scheme from the family of kernel estimators with varying bandwidths. For the proposed estimator we establish a so-called L_p-norm oracle inequality and use it for deriving minimax adaptive results. We prove the existence of rate-adaptive estimators and fully characte...
-
Authors: Li, Jian; Siegmund, David
Affiliations: Stanford University
Abstract: This paper compares the higher criticism statistic (Donoho and Jin [Ann. Statist. 32 (2004) 962-994]), a modification of the higher criticism statistic also suggested by Donoho and Jin, and two statistics of the Berk-Jones [Z. Wahrsch. Verw. Gebiete 47 (1979) 47-59] type. New approximations to the significance levels of the statistics are derived, and their accuracy is studied by simulations. By numerical examples it is shown that over a broad range of sample sizes the Berk-Jones statistics hav...
-
Authors: Ren, Zhao; Sun, Tingni; Zhang, Cun-Hui; Zhou, Harrison H.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; University System of Maryland; University of Maryland College Park; Rutgers University System; Rutgers University New Brunswick; Yale University
Abstract: The Gaussian graphical model, a popular paradigm for studying relationships among variables in a wide range of applications, has attracted great attention in recent years. This paper considers a fundamental question: When is it possible to estimate low-dimensional parameters at parametric square-root rate in a large Gaussian graphical model? A novel regression approach is proposed to obtain asymptotically efficient estimation of each entry of a precision matrix under a sparseness condition rela...
-
Authors: Klopp, Olga; Pensky, Marianna
Affiliations: Universite Paris Saclay; State University System of Florida; University of Central Florida
Abstract: The objective of the present paper is to develop a minimax theory for the varying coefficient model in a nonasymptotic setting. We consider a high-dimensional sparse varying coefficient model where only a few of the covariates are present and only some of those covariates are time dependent. Our analysis allows the time-dependent covariates to have different degrees of smoothness and to be spatially inhomogeneous. We develop the minimax lower bounds for the quadratic risk and construct an adapti...
-
Authors: Cai, T. Tony; Li, Xiaodong
Affiliations: University of Pennsylvania
Abstract: Community detection, which aims to cluster N nodes in a given graph into r distinct groups based on the observed undirected edges, is an important problem in network data analysis. In this paper, the popular stochastic block model (SBM) is extended to the generalized stochastic block model (GSBM) that allows for adversarial outlier nodes, which are connected with the other nodes in the graph in an arbitrary way. Under this model, we introduce a procedure using convex optimization followed by k...
-
Authors: Cheng, Guang; Shang, Zuofeng
Affiliations: Purdue University System; Purdue University
Abstract: We consider a joint asymptotic framework for studying semi-nonparametric regression models where (finite-dimensional) Euclidean parameters and (infinite-dimensional) functional parameters are both of interest. The class of models under consideration shares a partially linear structure, and the models are estimated in two general contexts: (i) quasi-likelihood and (ii) true likelihood. We first show that the Euclidean estimator and (pointwise) functional estimator, which are re-scaled at different rates, joint...
-
Authors: Krauthgamer, Robert; Nadler, Boaz; Vilenchik, Dan
Affiliations: Weizmann Institute of Science; Ben-Gurion University of the Negev
Abstract: Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions such algorithms can recover the sparse principal components. We study this question for a single-spike model with an l_0-sparse eigenvector, in the asymptotic...
-
Authors: Jentsch, Carsten; Politis, Dimitris N.
Affiliations: University of Mannheim; University of California System; University of California San Diego
Abstract: Multivariate time series present many challenges, especially when they are high dimensional. The paper's focus is twofold. First, we address the subject of consistently estimating the autocovariance sequence; this is a sequence of matrices that we conveniently stack into one huge matrix. We are then able to show consistency of an estimator based on the so-called flat-top tapers; most importantly, the consistency holds true even when the time series dimension is allowed to increase with the sam...
-
Authors: Ma, Zongming; Wu, Yihong
Affiliations: University of Pennsylvania; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: This paper studies the minimax detection of a small submatrix of elevated mean in a large matrix contaminated by additive Gaussian noise. To investigate the tradeoff between statistical performance and computational cost from a complexity-theoretic perspective, we consider a sequence of discretized models which are asymptotically equivalent to the Gaussian model. Under the hypothesis that the planted clique detection problem cannot be solved in randomized polynomial time when the clique size i...