-
Authors: Li, Xingxiang; Xu, Chen
Affiliations: Xi'an Jiaotong University; University of Ottawa
Abstract: Feature screening is a commonly used strategy to eliminate irrelevant features in high-dimensional classification. When one encounters big datasets with both high dimensionality and huge sample size, the conventional screening methods become computationally costly or even infeasible. In this article, we introduce a novel screening utility, Conditional Rank Utility (CRU), and propose a distributed feature screening procedure for the big-data classification. The proposed CRU effectively quantifi...
-
Authors: Li, Cheng; Sun, Saifei; Zhu, Yichen
Affiliations: National University of Singapore; Duke University
Abstract: Spatial Gaussian process regression models typically contain finite dimensional covariance parameters that need to be estimated from the data. We study the Bayesian estimation of covariance parameters including the nugget parameter in a general class of stationary covariance functions under fixed-domain asymptotics, which is theoretically challenging due to the increasingly strong dependence among spatial observations. We propose a novel adaptation of the Schwartz's consistency theorem for sho...
-
Authors: Lu, Zudi; Ren, Xiaohang; Zhang, Rongmao
Affiliations: University of Southampton; University of Southampton; Central South University; Zhejiang Gongshang University; Zhejiang University; University of Southampton
Abstract: Nonlinear dynamic modeling of spatio-temporal data is often a challenge, especially due to irregularly observed locations and location-wide nonstationarity. In this article we propose a semiparametric family of Dynamic Functional-coefficient Autoregressive Spatio-Temporal (DyFAST) models to address the difficulties. We specify the autoregressive smoothing coefficients depending dynamically on both a concerned regime and location so that the models can characterize not only the dynamic regime-s...
-
Authors: Yu, Weichang; Bondell, Howard D.
Affiliations: University of Melbourne
Abstract: We develop a fast and accurate approach to approximate posterior distributions in the Bayesian empirical likelihood framework. Bayesian empirical likelihood allows for the use of Bayesian shrinkage without specification of a full likelihood but is notorious for leading to several computational difficulties. By coupling the stochastic variational Bayes procedure with an adjusted empirical likelihood framework, the proposed method overcomes the intractability of both the exact posterior and the ...
-
Authors: Zhou, Le; Wang, Boxiang; Zou, Hui
Affiliations: Hong Kong Baptist University; University of Iowa; University of Minnesota System; University of Minnesota Twin Cities
Abstract: Wang et al. studied the high-dimensional sparse penalized rank regression and established its nice theoretical properties. Compared with the least squares, rank regression can have a substantial gain in estimation efficiency while maintaining a minimal relative efficiency of 86.4%. However, the computation of penalized rank regression can be very challenging for high-dimensional data, due to the highly nonsmooth rank regression loss. In this work we view the rank regression loss as a nonsmooth...
-
Authors: Bhatia, Kush; Ma, Yi-An; Dragan, Anca D.; Bartlett, Peter L.; Jordan, Michael I.
Affiliations: Stanford University; University of California System; University of California San Diego; University of California System; University of California Berkeley; University of California System; University of California Berkeley
Abstract: We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after T = O(d/ε_acc) iterations, we can sample from p_T such that dist(p_T, p*) = ε_acc + O(ε), where ε is the fraction of corruptions and dist represents th...
-
Authors: Chen, Yudong; Wang, Tengyao; Samworth, Richard J.
Affiliations: University of Cambridge; University of London; London School Economics & Political Science
Abstract: We introduce and study two new inferential challenges associated with the sequential detection of change in a high-dimensional mean vector. First, we seek a confidence interval for the changepoint, and second, we estimate the set of indices of coordinates in which the mean changes. We propose an online algorithm that produces an interval with guaranteed nominal coverage, and whose length is, with high probability, of the same order as the average detection delay, up to a logarithmic factor. Th...
-
Authors: He, Zhibing; Zhao, Yunpeng; Bickel, Peter; Weko, Charles; Cheng, Dan; Wang, Jirui
Affiliations: Arizona State University; Arizona State University-Tempe; University of California System; University of California Berkeley; Arizona State University; Arizona State University-Tempe
Abstract: Statistical network analysis primarily focuses on inferring the parameters of an observed network. In many applications, especially in the social sciences, the observed data is the groups formed by individual subjects. In these applications, the network is itself a parameter of a statistical model. Zhao and Weko propose a model-based approach, called the hub model, to infer implicit networks from grouping behavior. The hub model assumes that each member of the group is brought together by a me...
-
Authors: Mukherjee, Somabha
Affiliations: National University of Singapore
-
Authors: Zhang, Bo; Pan, Guangming; Yao, Qiwei; Zhou, Wang
Affiliations: Chinese Academy of Sciences; University of Science & Technology of China, CAS; Nanyang Technological University; University of London; London School Economics & Political Science; National University of Singapore
Abstract: We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors in addition to some common factors which impact on all the time series concerned. Our setting also offers the flexibility that some time series may not belong to any clusters. The consistency with explicit convergence rates is established for the estimation of the common factors, the cluster-specific fac...