-
Author: Mendelson, Shahar
Affiliations: Technion Israel Institute of Technology
Abstract: We show that if F is a convex class of functions that is L-sub-Gaussian, the error rate of learning problems generated by independent noise is equivalent to a fixed point determined by local covering estimates of the class (i.e., the covering number at a specific level), rather than by the Gaussian average, which takes into account the structure of F at an arbitrarily small scale. To that end, we establish new sharp upper and lower estimates on the error rate in such learning problems.
-
Authors: Li, Jun; Zhong, Ping-Shou
Affiliations: University System of Ohio; Kent State University; Kent State University Salem; Kent State University Kent; Michigan State University
Abstract: The paper considers the problem of recovering the sparse different components between two high-dimensional means of column-wise dependent random vectors. We show that dependence can be utilized to lower the identification boundary for signal recovery. Moreover, an optimal convergence rate for the marginal false nondiscovery rate (mFNR) is established under dependence. The convergence rate is faster than the optimal rate without dependence. To recover the sparse signal bearing dimensions, we pr...
-
Authors: Cheng, Dan; Schwartzman, Armin
Affiliations: Texas Tech University System; Texas Tech University; University of California System; University of California San Diego
Abstract: A topological multiple testing scheme is presented for detecting peaks in images under stationary ergodic Gaussian noise, where tests are performed at local maxima of the smoothed observed signals. The procedure generalizes the one-dimensional scheme of Schwartzman, Gavrilov and Adler [Ann. Statist. 39 (2011) 3290-3319] to Euclidean domains of arbitrary dimension. Two methods are developed according to two different ways of computing p-values: (i) using the exact distribution of the height of ...
-
Authors: Chernozhukov, Victor; Hansen, Christian; Liao, Yuan
Affiliations: Massachusetts Institute of Technology (MIT); University of Chicago; University System of Maryland; University of Maryland College Park
Abstract: Common high-dimensional methods for prediction rely on having either a sparse signal model, a model in which most parameters are zero and there are a small number of nonzero parameters that are large in magnitude, or a dense signal model, a model with no large parameters and very many small nonzero parameters. We consider a generalization of these two basic models, termed here a sparse + dense model, in which the signal is given by the sum of a sparse signal and a dense signal. Such a structur...
-
Authors: Khare, Kshitij; Pal, Subhadip; Su, Zhihua
Affiliations: State University System of Florida; University of Florida
Abstract: The envelope model is a new paradigm to address estimation and prediction in multivariate analysis. Using sufficient dimension reduction techniques, it has the potential to achieve substantial efficiency gains compared to standard models. This model was first introduced by [Statist. Sinica 20 (2010) 927-960] for multivariate linear regression, and has since been adapted to many other contexts. However, a Bayesian approach for analyzing envelope models has not yet been investigated in the liter...
-
Authors: Hu, Rui; Wiens, Douglas P.
Affiliations: MacEwan University; University of Alberta
Abstract: To aid in the discrimination between two, possibly nonlinear, regression models, we study the construction of experimental designs. Considering that each of these two models might be only approximately specified, robust maximin designs are proposed. The rough idea is as follows. We impose neighbourhood structures on each regression response, to describe the uncertainty in the specifications of the true underlying models. We determine the least favourable, in terms of Kullback-Leibler divergence...
-
Authors: Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.
Affiliations: Duke University; Texas A&M University System; Texas A&M University College Station
Abstract: Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support ...
-
Authors: Choi, Yunjin; Taylor, Jonathan; Tibshirani, Robert
Affiliations: National University of Singapore; Stanford University; Stanford University
Abstract: Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is the choice of the number of principal components. In order to address this challenge, we propose distribution-based methods with exact type 1 error controls for hypothesis testing and construction of confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error controls based on the conditional distribution...
-
Authors: Constantinou, Panayiota; Dawid, A. Philip
Affiliations: University of Warwick; University of Cambridge; University of Cambridge
Abstract: The goal of this paper is to integrate the notions of stochastic conditional independence and variation conditional independence under a more general notion of extended conditional independence. We show that under appropriate assumptions the calculus that applies for the two cases separately (axioms of a separoid) still applies for the extended case. These results provide a rigorous basis for a wide range of statistical concepts, including ancillarity and sufficiency, and, in particular, the D...
-
Authors: Feller, Chrystel; Schorning, Kirsten; Dette, Holger; Bermann, Georgina; Bornkamp, Bjoern
Affiliations: Novartis; Ruhr University Bochum
Abstract: A common problem in Phase II clinical trials is the comparison of dose response curves corresponding to different treatment groups. If the effect of the dose level is described by parametric regression models and the treatments differ in the administration frequency (but not in the sort of drug), a reasonable assumption is that the regression models for the different treatments share common parameters. This paper develops optimal design theory for the comparison of different regression models ...