-
Authors: Javanmard, Adel; Montanari, Andrea
Affiliations: University of Southern California; Stanford University; Stanford University
Abstract: Multiple hypothesis testing is a core problem in statistical inference and arises in almost every scientific field. Given a set of null hypotheses $H(n) = (H_1, \dots, H_n)$, Benjamini and Hochberg [J.R. Stat. Soc. Ser. B. Stat. Methodol. 57 (1995) 289-300] introduced the false discovery rate (FDR), which is the expected proportion of false positives among rejected null hypotheses, and proposed a testing procedure that controls FDR below a preassigned significance level. Nowadays FDR is the criteri...
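The abstract above refers to the Benjamini and Hochberg testing procedure without stating it. As a rough illustration only, the following is a minimal sketch of the standard step-up form of that procedure, not the method developed in the paper itself. The function name `benjamini_hochberg`, its parameters, and the example p-values are hypothetical and chosen for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Sketch of the standard Benjamini-Hochberg step-up rule (illustrative).

    Returns one rejection decision per p-value, in the original order.
    The name and signature are assumptions, not taken from the source.
    """
    n = len(p_values)
    if n == 0:
        return []
    # Indices of the hypotheses, sorted by ascending p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * alpha / n.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / n:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * n
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

Because the rule is step-up, a hypothesis whose own p-value exceeds its rank threshold can still be rejected when a larger p-value meets its threshold: with p-values 0.03 and 0.04 at level 0.05, both are rejected.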
-
Authors: Chen, Xi; Liu, Weidong
Affiliations: New York University; Shanghai Jiao Tong University; Shanghai Jiao Tong University
Abstract: Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging n identically distributed p-dimensional random vectors into a $p \times n$ data matrix, we investigate the problem of testing independence among columns under the matrix-variate normal modeling of data. We propose a computationally simple and tuning-free test statistic, characterize its limiting null distribution, analyze the statistical power and prove its minimax opt...
-
Authors: Lv, Shaogao; Lin, Huazhen; Lian, Heng; Huang, Jian
Affiliations: Nanjing Audit University; Southwestern University of Finance & Economics - China; City University of Hong Kong; University of Iowa
Abstract: This paper considers the estimation of the sparse additive quantile regression (SAQR) in high-dimensional settings. Given the nonsmooth nature of the quantile loss function and the nonparametric complexities of the component function estimation, it is challenging to analyze the theoretical properties of ultrahigh-dimensional SAQR. We propose a regularized learning approach with a two-fold Lasso-type regularization in a reproducing kernel Hilbert space (RKHS) for SAQR. We establish nonasymptoti...
-
Authors: Xia, Ningning; Zheng, Xinghua
Affiliations: Shanghai University of Finance & Economics; Hong Kong University of Science & Technology
Abstract: In practice, observations are often contaminated by noise, making the resulting sample covariance matrix a signal-plus-noise sample covariance matrix. Aiming to make inferences about the spectral distribution of the population covariance matrix under such a situation, we establish an asymptotic relationship that describes how the limiting spectral distribution of (signal) sample covariance matrices depends on that of signal-plus-noise type sample covariance matrices. As an application, we consi...
-
Authors: He, Yuanzhen; Cheng, Ching-Shui; Tang, Boxin
Affiliations: Beijing Normal University; Academia Sinica - Taiwan; Simon Fraser University
Abstract: Strong orthogonal arrays were recently introduced and studied in He and Tang [Biometrika 100 (2013) 254-260] as a class of space-filling designs for computer experiments. To enjoy the benefits of better space-filling properties, when compared to ordinary orthogonal arrays, strong orthogonal arrays need to have strength three or higher, which may require run sizes that are too large for experimenters to afford. To address this problem, we introduce a new class of arrays, called strong orthogona...
-
Authors: Rao, Suhasini Subba
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: A class of Fourier-based statistics for irregularly spaced spatial data is introduced. Examples include the Whittle likelihood, a parametric estimator of the covariance function based on the $L_2$-contrast function and a simple nonparametric estimator of the spatial autocovariance which is a nonnegative function. The Fourier-based statistic is a quadratic form of a discrete Fourier-type transform of the spatial data. Evaluation of the statistic is computationally tractable, requiring O(nb) operati...
-
Authors: Tian, Xiaoying; Taylor, Jonathan
Affiliations: Stanford University
Abstract: Inspired by sample splitting and the reusable holdout introduced in the field of differential privacy, we consider selective inference with a randomized response. We discuss two major advantages of using a randomized response for model selection. First, the selectively valid tests are more powerful after randomized selection. Second, it allows consistent estimation and weak convergence of selective inference procedures. Under independent sampling, we prove a selective (or privatized) central l...
-
Authors: Velasco, Carlos; Lobato, Ignacio N.
Affiliations: Universidad Carlos III de Madrid; Instituto Tecnologico Autonomo de Mexico
Abstract: This article introduces frequency domain minimum distance procedures for performing inference in general, possibly noncausal and/or noninvertible, autoregressive moving average (ARMA) models. We use information from higher order moments to achieve identification on the location of the roots of the AR and MA polynomials for non-Gaussian time series. We propose a minimum distance estimator that optimally combines the information contained in second, third, and fourth moments. Contrary to existi...
-
Authors: Lecue, Guillaume; Mendelson, Shahar
Affiliations: Institut Polytechnique de Paris; ENSAE Paris; Centre National de la Recherche Scientifique (CNRS); Universite Paris Saclay; Technion Israel Institute of Technology; Australian National University; Institut Polytechnique de Paris; ENSAE Paris
Abstract: We obtain bounds on estimation error rates for regularization procedures of the form $\hat{f} \in \operatorname{argmin}_{f \in F} \left( \frac{1}{N} \sum_{i=1}^{N} (Y_i - f(X_i))^2 + \lambda \Psi(f) \right)$ when $\Psi$ is a norm and $F$ is convex. Our approach gives a common framework that may be used in the analysis of learning problems and regularization problems alike. In particular, it sheds some light on the role various notions of sparsity have in regularization and on their connection with the size of...
-
Authors: Qiu, Yumou; Chen, Song Xi; Nettleton, Dan
Affiliations: University of Nebraska System; University of Nebraska Lincoln; Peking University; Peking University; Iowa State University
Abstract: Motivated by the analysis of RNA sequencing (RNA-seq) data for genes differentially expressed across multiple conditions, we consider detecting rare and faint signals in high-dimensional response variables. We address the signal detection problem under a general framework, which includes generalized linear models for count-valued responses as special cases. We propose a test statistic that carries out a multi-level thresholding on maximum likelihood estimators (MLEs) of the signals, based on a...