-
Authors: Tang, Chuan-Fa; Wang, Dewei; Tebbs, Joshua M.
Affiliations: University of South Carolina System; University of South Carolina Columbia
Abstract: We propose L_p distance-based goodness-of-fit (GOF) tests for uniform stochastic ordering with two continuous distributions F and G, both of which are unknown. Our tests are motivated by the fact that when F and G are uniformly stochastically ordered, the ordinal dominance curve R = FG^{-1} is star-shaped. We derive asymptotic distributions and prove that our testing procedure has a unique least favorable configuration of F and G for p ∈ [1, ∞]. We use simulation to assess...
-
Authors: Cai, T. Tony; Guo, Zijian
Affiliations: University of Pennsylvania
Abstract: Confidence sets play a fundamental role in statistical inference. In this paper, we consider confidence intervals for high-dimensional linear regression with random design. We first establish the convergence rates of the minimax expected length for confidence intervals in the oracle setting where the sparsity parameter is given. The focus is then on the problem of adaptation to sparsity for the construction of confidence intervals. Ideally, an adaptive confidence interval should have its lengt...
-
Authors: Dobriban, Edgar
Affiliations: Stanford University
Abstract: Principal component analysis (PCA) is a widely used method for dimension reduction. In high-dimensional data, the signal eigenvalues corresponding to weak principal components (PCs) do not necessarily separate from the bulk of the noise eigenvalues. Therefore, popular tests based on the largest eigenvalue have little power to detect weak PCs. In the special case of the spiked model, certain tests asymptotically equivalent to linear spectral statistics (LSS), averaging effects over all eigenvalu...
-
Authors: Klopp, Olga; Tsybakov, Alexandre B.; Verzelen, Nicolas
Affiliations: Universite Paris Saclay; Centre National de la Recherche Scientifique (CNRS); CNRS - Institute for Humanities & Social Sciences (INSHS); Institut Polytechnique de Paris; ENSAE Paris; INRAE
Abstract: Inhomogeneous random graph models encompass many network models such as stochastic block models and latent position models. We consider the problem of statistical estimation of the matrix of connection probabilities based on the observations of the adjacency matrix of the network. Taking the stochastic block model as an approximation, we construct estimators of network connection probabilities: the ordinary block constant least squares estimator and its restricted version. We show that they sa...
-
Authors: Loh, Po-Ling; Wainwright, Martin J.
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of California System; University of California Berkeley
Abstract: We develop a new primal-dual witness proof framework that may be used to establish variable selection consistency and ℓ_∞-bounds for sparse regression problems, even when the loss function and regularizer are nonconvex. We use this method to prove two theorems concerning support recovery and ℓ_∞-guarantees for a regression estimator in a general setting. Notably, our theory applies to all potential stationary points of the objective and certifies that the stationary point is un...
-
Authors: Mendelson, Shahar
Affiliations: Technion Israel Institute of Technology
Abstract: We show that if F is a convex class of functions that is L-sub-Gaussian, the error rate of learning problems generated by independent noise is equivalent to a fixed point determined by local covering estimates of the class (i.e., the covering number at a specific level), rather than by the Gaussian average, which takes into account the structure of F at an arbitrarily small scale. To that end, we establish new sharp upper and lower estimates on the error rate in such learning problems.
-
Authors: Li, Jun; Zhong, Ping-Shou
Affiliations: University System of Ohio; Kent State University; Kent State University Salem; Kent State University Kent; Michigan State University
Abstract: The paper considers the problem of recovering the sparse different components between two high-dimensional means of column-wise dependent random vectors. We show that dependence can be utilized to lower the identification boundary for signal recovery. Moreover, an optimal convergence rate for the marginal false nondiscovery rate (mFNR) is established under dependence. The convergence rate is faster than the optimal rate without dependence. To recover the sparse signal bearing dimensions, we pr...
-
Authors: Cheng, Dan; Schwartzman, Armin
Affiliations: Texas Tech University System; Texas Tech University; University of California System; University of California San Diego
Abstract: A topological multiple testing scheme is presented for detecting peaks in images under stationary ergodic Gaussian noise, where tests are performed at local maxima of the smoothed observed signals. The procedure generalizes the one-dimensional scheme of Schwartzman, Gavrilov and Adler [Ann. Statist. 39 (2011) 3290-3319] to Euclidean domains of arbitrary dimension. Two methods are developed according to two different ways of computing p-values: (i) using the exact distribution of the height of ...
-
Authors: Chernozhukov, Victor; Hansen, Christian; Liao, Yuan
Affiliations: Massachusetts Institute of Technology (MIT); University of Chicago; University System of Maryland; University of Maryland College Park
Abstract: Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of nonzero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small nonzero parameters. We consider a generalization of these two basic models, termed here a sparse + dense model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structur...
-
Authors: Khare, Kshitij; Pal, Subhadip; Su, Zhihua
Affiliations: State University System of Florida; University of Florida
Abstract: The envelope model is a new paradigm to address estimation and prediction in multivariate analysis. Using sufficient dimension reduction techniques, it has the potential to achieve substantial efficiency gains compared to standard models. This model was first introduced by [Statist. Sinica 20 (2010) 927-960] for multivariate linear regression, and has since been adapted to many other contexts. However, a Bayesian approach for analyzing envelope models has not yet been investigated in the liter...