-
Authors: Bien, Jacob; Taylor, Jonathan; Tibshirani, Robert
Affiliations: Cornell University; Cornell University; Stanford University; Stanford University
Abstract: We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise characterization of the effect of this hierarchy constraint, prove that hierarchy holds with probability one and derive an unbiased estimate for the degrees of freedom of our estimator. A bound on this estimate reveals the amount of fitting saved by the hiera...
-
Authors: Zhang, Xianyang; Shao, Xiaofeng
Affiliations: University of Missouri System; University of Missouri Columbia; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: In this paper, we derive higher order Edgeworth expansions for the finite sample distributions of the subsampling-based t-statistic and the Wald statistic in the Gaussian location model under the so-called fixed-smoothing paradigm. In particular, we show that the error of asymptotic approximation is at the order of the reciprocal of the sample size and obtain explicit forms for the leading error terms in the expansions. The results are used to justify the second-order correctness of a new boot...
-
Authors: Wang, Lan; Kim, Yongdai; Li, Runze
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Seoul National University (SNU); Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: We investigate high-dimensional nonconvex penalized regression, where the number of covariates may grow at an exponential rate. Although recent asymptotic theory established that there exists a local minimum possessing the oracle property under general conditions, it is still largely an open problem how to identify the oracle estimator among potentially multiple local minima. There are two main obstacles: (1) due to the presence of multiple minima, the solution path is nonunique and is not gua...
-
Authors: Castillo, Ismael; Nickl, Richard
Affiliations: Centre National de la Recherche Scientifique (CNRS); Sorbonne Universite; Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); University of Cambridge
Abstract: Bernstein-von Mises theorems for nonparametric Bayes priors in the Gaussian white noise model are proved. It is demonstrated how such results justify Bayes methods as efficient frequentist inference procedures in a variety of concrete nonparametric problems. In particular, Bayesian credible sets are constructed that have asymptotically exact $1 - \alpha$ frequentist coverage level and whose $L^2$-diameter shrinks at the minimax rate of convergence (within logarithmic factors) over Hölder balls. Other...
-
Authors: Lv, Jinchi
Affiliations: University of Southern California
Abstract: High-dimensional data sets are commonly collected in many contemporary applications arising in various fields of scientific research. We present two views of finite samples in high dimensions: a probabilistic one and a nonprobabilistic one. With the probabilistic view, we establish the concentration property and robust spark bound for large random design matrices generated from elliptical distributions, with the former related to the sure screening property and the latter related to sparse model...
-
Authors: Fromont, Magalie; Laurent, Beatrice; Reynaud-Bouret, Patricia
Affiliations: Universite de Rennes; Universite Rennes 2; Universite Cote d'Azur; Centre National de la Recherche Scientifique (CNRS)
Abstract: Considering two independent Poisson processes, we address the question of testing equality of their respective intensities. We first propose testing procedures whose test statistics are U-statistics based on single kernel functions. The corresponding critical values are constructed from a nonasymptotic wild bootstrap approach, leading to level alpha tests. Various choices for the kernel functions are possible, including projection, approximation or reproducing kernels. In this last case, we o...
-
Authors: Bacallado, Sergio; Favaro, Stefano; Trippa, Lorenzo
Affiliations: Stanford University; University of Turin; Collegio Carlo Alberto; Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard University Medical Affiliates; Dana-Farber Cancer Institute
Abstract: We introduce a three-parameter random walk with reinforcement, called the (theta, alpha, beta) scheme, which generalizes the linearly edge reinforced random walk to uncountable spaces. The parameter beta smoothly tunes the (theta, alpha, beta) scheme between this edge reinforced random walk and the classical exchangeable two-parameter Hoppe urn scheme, while the parameters alpha and theta modulate how many states are typically visited. Resorting to de Finetti's theorem for Markov chains, we use th...
-
Authors: Chung, EunYi; Romano, Joseph P.
Affiliations: Stanford University; Stanford University
Abstract: Given independent samples from P and Q, two-sample permutation tests allow one to construct exact level tests when the null hypothesis is P = Q. On the other hand, when comparing or testing particular parameters theta of P and Q, such as their means or medians, permutation tests need not be level alpha, or even approximately level alpha in large samples. Under very weak assumptions for comparing estimators, we provide a general test procedure whereby the asymptotic validity of the permutation test...
-
Authors: Koltchinskii, Vladimir; Rangel, Pedro
Affiliations: University System of Georgia; Georgia Institute of Technology
Abstract: Let $(V, A)$ be a weighted graph with a finite vertex set $V$, with a symmetric matrix of nonnegative weights $A$ and with Laplacian $\Delta$. Let $S_* : V \times V \mapsto \mathbb{R}$ be a symmetric kernel defined on the vertex set $V$. Consider $n$ i.i.d. observations $(X_j, X'_j, Y_j)$, $j = 1, \dots, n$, where $X_j, X'_j$ are independent random vertices sampled from the uniform distribution in $V$ and $Y_j \in \mathbb{R}$ is a real valued response variable such that $\mathbb{E}(Y_j \mid X_j, X'_j) = S_*(X_j, X'_j)$, j...
-
Authors: Ma, Zongming
Affiliations: University of Pennsylvania
Abstract: Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, w...