-
Authors: Butucea, Cristina; Meister, Alexander; Rohde, Angelika
Affiliations: Institut Polytechnique de Paris; ENSAE Paris; University of Rostock; University of Freiburg
Abstract: We consider a general class of statistical experiments, in which an n-dimensional centered Gaussian random variable is observed and its covariance matrix is the parameter of interest. The covariance matrix is assumed to be well-approximable in a linear space of lower dimension $K_n$ with eigenvalues uniformly bounded away from zero and infinity. We prove asymptotic equivalence of this experiment and a class of $K_n$-dimensional Gaussian models with informative expectation in Le Cam's sense when n ten...
-
Authors: Wen, Kaiyue; Wang, Tengyao; Wang, Yuhao
Affiliations: Stanford University; University of London; London School Economics & Political Science; Tsinghua University
Abstract: We consider the problem of testing whether a single coefficient is equal to zero in linear models when the dimension of covariates p can be up to a constant fraction of sample size n. In this regime, an important topic is to propose tests with finite-sample valid size control without requiring the noise to follow strong distributional assumptions. In this paper, we propose a new method, called the residual permutation test (RPT), which is constructed by projecting the regression residuals onto...
-
Authors: Auddy, Arnab; Yuan, Ming
Affiliations: University System of Ohio; Ohio State University; Columbia University
Abstract: In this paper, we investigate the optimal statistical performance and the impact of computational constraints for independent component analysis (ICA). Our goal is twofold. On the one hand, we characterize the precise role of dimensionality on sample complexity and statistical accuracy, and how computational consideration may affect them. In particular, we show that the optimal sample complexity is linear in dimensionality, and interestingly, the commonly used sample kurtosis-based approaches ...
-
Authors: Moitra, Ankur; Wein, Alexander S.
Affiliations: Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); University of California System; University of California Davis
Abstract: We revisit the fundamental question of simple-versus-simple hypothesis testing with an eye toward computational complexity, as the statistically optimal likelihood ratio test is often computationally intractable in high-dimensional settings. In the classical spiked Wigner model with a general i.i.d. spike prior, we show (conditional on a conjecture) that an existing test based on linear spectral statistics achieves the best possible trade-off curve between type-I and type-II error rates among a...
-
Authors: Montanari, Andrea; Ruan, Feng; Sohn, Youngtak; Yan, Jun
Affiliations: Stanford University; Stanford University; Northwestern University; Massachusetts Institute of Technology (MIT)
Abstract: Modern machine learning classifiers often exhibit vanishing classification error on the training set. They achieve this by learning nonlinear representations of the inputs that map the data into linearly separable classes. Motivated by these phenomena, we revisit high-dimensional maximum margin classification for linearly separable data. We consider a stylized setting in which data $(y_i, x_i)$, $i \le n$, are i.i.d. with $x_i \sim N(0, \Sigma)$ a p-dimensional Gaussian fe...
-
Authors: Zhao, Junlong; Liu, Xiumin; Du, Bin; Liu, Yufeng
Affiliations: Beijing Normal University; Beijing Technology & Business University; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina School of Medicine
Abstract: Converting a continuous variable into a discrete one is a commonly used technique for solving various problems in both statistics and machine learning. It is well known that discretizations result in biases. However, this issue has not been studied systematically. In this paper, a general framework is proposed to understand and compare the approximation errors of different slicing strategies. Poincaré-type inequalities are first established for univariate discretizations and then gene...
-
Authors: Jiao, Yuling; Kang, Lican; Liu, Jin; Liu, Xiliang; Yang, Jerry Zhijian
Affiliations: Wuhan University; Wuhan University; The Chinese University of Hong Kong, Shenzhen; Wuhan University; Wuhan University; Wuhan University
Abstract: In this paper, we consider deep approximate policy iteration (DAPI) with the Bellman residual minimization in reinforcement learning. In each iteration of DAPI, we apply convolutional neural networks (CNNs) with ReLU activation, called ReLU CNNs, to estimate the fixed point of the Bellman equation by minimizing an unbiased minimax loss. To bound the estimation error in each iteration, we control the statistical and approximation errors using the tools of the empirical process theory with depen...
-
Authors: Fan, Yingying; Gao, Lan; Lv, Jinchi
Affiliations: University of Southern California; University of Tennessee System; University of Tennessee Knoxville
Abstract: We investigate the robustness of the model-X knockoffs framework with respect to the misspecified or estimated feature distribution. We achieve such a goal by theoretically studying the feature selection performance of a practically implemented knockoffs algorithm, which we name as the approximate knockoffs (ARK) procedure, under the measures of the false discovery rate (FDR) and k-familywise error rate (k-FWER). The approximate knockoffs procedure differs from the model-X knockoffs procedure ...
-
Authors: Shi, Lei; Wang, Jingshen; Ding, Peng
Affiliations: University of California System; University of California Berkeley; University of California System; University of California Berkeley
Abstract: Ever since the seminal work of R. A. Fisher and F. Yates, factorial designs have been an important experimental tool to simultaneously estimate the effects of multiple treatment factors. In factorial designs, the number of treatment combinations grows exponentially with the number of treatment factors, which motivates the forward selection strategy based on the sparsity, hierarchy and heredity principles for factorial effects. Although this strategy is intuitive and has been widely used in pra...
-
Authors: Gao, Zhe; Wang, Roulin; Wang, Xueqin; Zhang, Heping
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; East China Normal University; Yale University
Abstract: The exploration of associations between random objects with complex geometric structures has catalyzed the development of various novel statistical tests encompassing distance-based and kernel-based statistics. These methods have various strengths and limitations. One problem is that their test statistics tend to converge to asymptotic null distributions involving second-order Wiener chaos, which are hard to compute and need approximation or permutation techniques that use much computing power ...