-
Authors: Freund, Robert M.; Grigas, Paul; Mazumder, Rahul
Affiliations: Massachusetts Institute of Technology (MIT); University of California System; University of California Berkeley
Abstract: We analyze boosting algorithms [Ann. Statist. 29 (2001) 1189-1232; Ann. Statist. 28 (2000) 337-407; Ann. Statist. 32 (2004) 407-499] in linear regression from a new perspective: that of modern first-order methods in convex optimization. We show that classic boosting algorithms in linear regression, namely the incremental forward stagewise algorithm (FS_ε) and least squares boosting [LS-BOOST(ε)], can be viewed as subgradient descent to minimize the loss function defined as the maxi...
-
Authors: Liu, Weidong
Affiliations: Shanghai Jiao Tong University
Abstract: We present a new framework for inferring structural similarities and differences among multiple high-dimensional Gaussian graphical models (GGMs) corresponding to the same set of variables under distinct experimental conditions. The new framework adopts the partial correlation coefficients to characterize the potential changes of dependency strengths between two variables. A hierarchical method has been further developed to recover edges with different or similar dependency strengths across mul...
-
Authors: Ray, Kolyan
Affiliations: Leiden University; Leiden University - Excl LUMC
Abstract: We investigate Bernstein-von Mises theorems for adaptive nonparametric Bayesian procedures in the canonical Gaussian white noise model. We consider both a Hilbert space and multiscale setting with applications in L^2 and L^∞, respectively. This provides a theoretical justification for plug-in procedures, for example the use of certain credible sets for sufficiently smooth linear functionals. We use this general approach to construct optimal frequentist confidence sets based on the poste...
-
Authors: Chan, Hock Peng
Affiliations: National University of Singapore
Abstract: Consider a large number of detectors each generating a data stream. The task is to detect, online, distribution changes in a small fraction of the data streams. Previous approaches to this problem include the use of mixture likelihood ratios and sum of CUSUMs. We provide here extensions and modifications of these approaches that are optimal in detecting normal mean shifts. We show how the (optimal) detection delay depends on the fraction of data streams undergoing distribution changes as the num...
-
Authors: Ando, Tomohiro; Li, Ker-Chau
Affiliations: University of Melbourne; University of California System; University of California Los Angeles; Academia Sinica - Taiwan
Abstract: Model averaging has long been proposed as a powerful alternative to model selection in regression analysis. However, how well it performs in high-dimensional regression is still poorly understood. Recently, Ando and Li [J. Amer. Statist. Assoc. 109 (2014) 254-265] introduced a new method of model averaging that allows the number of predictors to increase as the sample size increases. One notable feature of Ando and Li's method is the relaxation on the total model weights so that weak signals c...
-
Authors: Datta, Abhirup; Zou, Hui
Affiliations: Johns Hopkins University; University of Minnesota System; University of Minnesota Twin Cities
Abstract: Much theoretical and applied work has been devoted to high-dimensional regression with clean data. However, we often face corrupted data in many applications where missing data and measurement errors cannot be ignored. Loh and Wainwright [Ann. Statist. 40 (2012) 1637-1664] proposed a non-convex modification of the Lasso for doing high-dimensional regression with noisy and missing data. It is generally agreed that the virtues of convexity contribute fundamentally to the success and popularity of t...
-
Authors: Ning, Yang; Zhao, Tianqi; Liu, Han
Affiliations: Cornell University; Princeton University
Abstract: We propose a new inferential framework for high-dimensional semiparametric generalized linear models. This framework addresses a variety of challenging problems in high-dimensional data analysis, including incomplete data, selection bias and heterogeneity. Our work has three main contributions: (i) We develop a regularized statistical chromatography approach to infer the parameter of interest under the proposed semiparametric generalized linear model without the need of estimating the unknown ...
-
Authors: Lin, Yuan-Lung; Phoa, Frederick Kin Hing; Kao, Ming-Hung
Affiliations: Academia Sinica - Taiwan; Arizona State University; Arizona State University-Tempe
Abstract: Functional magnetic resonance imaging (fMRI) is a pioneering technology for studying brain activity in response to mental stimuli. Although efficient designs for these fMRI experiments are important for rendering precise statistical inference on brain functions, they are not systematically constructed. Design with circulant property is crucial for estimating a hemodynamic response function (HRF) and discussing fMRI experimental optimality. In this paper, we develop a theory that not only succe...
-
Authors: Mousavi, Ali; Maleki, Arian; Baraniuk, Richard G.
Affiliations: Rice University; Columbia University
Abstract: This paper studies the optimal tuning of the regularization parameter in LASSO or the threshold parameters in approximate message passing (AMP). Considering a model in which the design matrix and noise are zero-mean i.i.d. Gaussian, we propose a data-driven approach for estimating the regularization parameter of LASSO and the threshold parameters in AMP. Our estimates are consistent, that is, they converge to their asymptotically optimal values in probability as n, the number of observations, ...
-
Authors: Anevski, Dragi; Gill, Richard D.; Zohren, Stefan
Affiliations: Lund University; Leiden University - Excl LUMC; Leiden University; University of Oxford
Abstract: In the context of a species sampling problem, we discuss a nonparametric maximum likelihood estimator for the underlying probability mass function. The estimator is known in the computer science literature as the high profile estimator. We prove strong consistency and derive the rates of convergence, for an extended model version of the estimator. We also study a sieved estimator for which similar consistency results are derived. Numerical computation of the sieved estimator is of great intere...