-
Authors: Freund, Robert M.; Grigas, Paul; Mazumder, Rahul
Affiliations: Massachusetts Institute of Technology (MIT); University of California System; University of California Berkeley
Abstract: We analyze boosting algorithms [Ann. Statist. 29 (2001) 1189-1232; Ann. Statist. 28 (2000) 337-407; Ann. Statist. 32 (2004) 407-499] in linear regression from a new perspective: that of modern first-order methods in convex optimization. We show that classic boosting algorithms in linear regression, namely the incremental forward stagewise algorithm (FS$_\varepsilon$) and least squares boosting [LS-Boost($\varepsilon$)], can be viewed as subgradient descent to minimize the loss function defined as the maxi...
-
Authors: Liu, Weidong
Affiliations: Shanghai Jiao Tong University
Abstract: We present a new framework on inferring structural similarities and differences among multiple high-dimensional Gaussian graphical models (GGMs) corresponding to the same set of variables under distinct experimental conditions. The new framework adopts the partial correlation coefficients to characterize the potential changes of dependency strengths between two variables. A hierarchical method has been further developed to recover edges with different or similar dependency strengths across mul...
-
Authors: Ray, Kolyan
Affiliations: Leiden University; Leiden University - Excl LUMC
Abstract: We investigate Bernstein-von Mises theorems for adaptive nonparametric Bayesian procedures in the canonical Gaussian white noise model. We consider both a Hilbert space and multiscale setting with applications in $L^2$ and $L^\infty$, respectively. This provides a theoretical justification for plug-in procedures, for example the use of certain credible sets for sufficiently smooth linear functionals. We use this general approach to construct optimal frequentist confidence sets based on the poste...
-
Authors: Lok, Judith J.
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health
Abstract: In observational studies, treatment may be adapted to covariates at several times without a fixed protocol, in continuous time. Treatment influences covariates, which influence treatment, which influences covariates and so on. Then even time-dependent Cox-models cannot be used to estimate the net treatment effect. Structural nested models have been applied in this setting. Structural nested models are based on counterfactuals: the outcome a person would have had had treatment been withheld aft...
-
Authors: Rousseau, Judith; Szabo, Botond
Affiliations: Universite PSL; Universite Paris-Dauphine; Institut Polytechnique de Paris; ENSAE Paris; Budapest University of Technology & Economics; Leiden University - Excl LUMC; Leiden University
Abstract: We consider the asymptotic behaviour of the marginal maximum likelihood empirical Bayes posterior distribution in a general setting. First, we characterize the set where the maximum marginal likelihood estimator is located with high probability. Then we provide oracle-type upper and lower bounds for the contraction rates of the empirical Bayes posterior. We also show that the hierarchical Bayes posterior achieves the same contraction rate as the maximum marginal likelihood empirical Bayes pos...
-
Authors: Hang, Hanyuan; Steinwart, Ingo
Affiliations: University of Stuttgart
Abstract: We establish a Bernstein-type inequality for a class of stochastic processes that includes the classical geometrically $\phi$-mixing processes, Rio's generalization of these processes and many time-discrete dynamical systems. Modulo a logarithmic factor and some constants, our Bernstein-type inequality coincides with the classical Bernstein inequality for i.i.d. data. We further use this new Bernstein-type inequality to derive an oracle inequality for generic regularized empirical risk minimizati...
-
Authors: Koltchinskii, Vladimir; Lounici, Karim
Affiliations: University System of Georgia; Georgia Institute of Technology
Abstract: Let $X, X_1, \dots, X_n$ be i.i.d. Gaussian random variables in a separable Hilbert space $\mathbb{H}$ with zero mean and covariance operator $\Sigma = \mathbb{E}(X \otimes X)$, and let $\hat{\Sigma} := n^{-1} \sum_{j=1}^{n} (X_j \otimes X_j)$ be the sample (empirical) covariance operator based on $(X_1, \dots, X_n)$. Denote by $P_r$ the spectral projector of $\Sigma$ corresponding to its $r$th eigenvalue $\mu_r$ and by $\hat{P}_r$ the empirical counterpart of $P_r$. The main goal of the paper is to obtain tight bounds on sup(x ...
-
Authors: Chan, Hock Peng
Affiliations: National University of Singapore
Abstract: Consider a large number of detectors each generating a data stream. The task is to detect, online, distribution changes in a small fraction of the data streams. Previous approaches to this problem include the use of mixture likelihood ratios and sum of CUSUMs. We provide here extensions and modifications of these approaches that are optimal in detecting normal mean shifts. We show how the (optimal) detection delay depends on the fraction of data streams undergoing distribution changes as the num...
-
Authors: Aston, John A. D.; Pigoli, Davide; Tavakoli, Shahin
Affiliations: University of Cambridge
Abstract: The assumption of separability of the covariance operator for a random image or hypersurface can be of substantial use in applications, especially in situations where the accurate estimation of the full covariance structure is unfeasible, either for computational reasons, or due to a small sample size. However, inferential tools to verify this assumption are somewhat lacking in high-dimensional or functional data analysis settings, where this assumption is most relevant. We propose here to tes...
-
Authors: Ando, Tomohiro; Li, Ker-Chau
Affiliations: University of Melbourne; University of California System; University of California Los Angeles; Academia Sinica - Taiwan
Abstract: Model averaging has long been proposed as a powerful alternative to model selection in regression analysis. However, how well it performs in high-dimensional regression is still poorly understood. Recently, Ando and Li [J. Amer. Statist. Assoc. 109 (2014) 254-265] introduced a new method of model averaging that allows the number of predictors to increase as the sample size increases. One notable feature of Ando and Li's method is the relaxation on the total model weights so that weak signals c...