-
Authors: Li, Xingxiang; Xu, Chen
Affiliations: Xi'an Jiaotong University; University of Ottawa
Abstract: Feature screening is a commonly used strategy to eliminate irrelevant features in high-dimensional classification. When one encounters big datasets with both high dimensionality and huge sample size, conventional screening methods become computationally costly or even infeasible. In this article, we introduce a novel screening utility, Conditional Rank Utility (CRU), and propose a distributed feature screening procedure for big-data classification. The proposed CRU effectively quantifi...
-
Authors: Li, Cheng; Sun, Saifei; Zhu, Yichen
Affiliations: National University of Singapore; Duke University
Abstract: Spatial Gaussian process regression models typically contain finite-dimensional covariance parameters that need to be estimated from the data. We study the Bayesian estimation of covariance parameters, including the nugget parameter, in a general class of stationary covariance functions under fixed-domain asymptotics, which is theoretically challenging due to the increasingly strong dependence among spatial observations. We propose a novel adaptation of Schwartz's consistency theorem for sho...
-
Authors: Lu, Zudi; Ren, Xiaohang; Zhang, Rongmao
Affiliations: University of Southampton; Central South University; Zhejiang Gongshang University; Zhejiang University
Abstract: Nonlinear dynamic modeling of spatio-temporal data is often a challenge, especially due to irregularly observed locations and location-wide nonstationarity. In this article, we propose a semiparametric family of Dynamic Functional-coefficient Autoregressive Spatio-Temporal (DyFAST) models to address these difficulties. We specify the autoregressive smoothing coefficients to depend dynamically on both the regime of concern and the location, so that the models can characterize not only the dynamic regime-s...
-
Authors: Yu, Weichang; Bondell, Howard D.
Affiliations: University of Melbourne
Abstract: We develop a fast and accurate approach to approximate posterior distributions in the Bayesian empirical likelihood framework. Bayesian empirical likelihood allows for the use of Bayesian shrinkage without specification of a full likelihood but is notorious for leading to several computational difficulties. By coupling the stochastic variational Bayes procedure with an adjusted empirical likelihood framework, the proposed method overcomes the intractability of both the exact posterior and the ...
-
Authors: Liu, Bingyuan; Zhang, Qi; Xue, Lingzhou; Song, Peter X.-K.; Kang, Jian
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of Michigan System; University of Michigan
Abstract: It is important to develop statistical techniques to analyze high-dimensional data in the presence of both complex dependence and possible heavy tails and outliers in real-world applications such as imaging data analyses. We propose a new robust high-dimensional regression with coefficient thresholding, in which an efficient nonconvex estimation procedure is developed through a thresholding function and the robust Huber loss. The proposed regularization method accounts for complex dependence st...
-
Authors: Demirkaya, Emre; Fan, Yingying; Gao, Lan; Lv, Jinchi; Vossler, Patrick; Wang, Jingbo
Affiliations: University of Tennessee System; University of Tennessee Knoxville; University of Southern California; Chinese University of Hong Kong
Abstract: The weighted nearest neighbors (WNN) estimator is widely used as a flexible and easy-to-implement nonparametric tool for mean regression estimation. The bagging technique is an elegant way to form WNN estimators with weights automatically generated to the nearest neighbors; we call the resulting estimator the distributional nearest neighbors (DNN) estimator for easy reference. Yet, there is a lack of distributional results for such an estimator, limiting its application to statistical inference...
-
Authors: Awan, Jordan; Wang, Zhanyu
Abstract: Privacy protection methods, such as differentially private mechanisms, introduce noise into the resulting statistics, which often produces complex and intractable sampling distributions. In this article, we propose a simulation-based repro sample approach to produce statistically valid confidence intervals and hypothesis tests, which builds on the work of Xie and Wang. We show that this methodology is applicable to a wide variety of private inference problems, appropriately accounts for biases intr...
-
Authors: Vogels, Lucas; Mohammadi, Reza; Schoonhoven, Marit; Birbil, S. Ilker
Affiliations: University of Amsterdam
Abstract: Gaussian graphical models provide a powerful framework to reveal the conditional dependency structure between multivariate variables. The process of uncovering the conditional dependency network is known as structure learning. Bayesian methods can measure the uncertainty of conditional relationships and include prior information. However, frequentist methods are often preferred due to the computational burden of the Bayesian approach. Over the last decade, Bayesian methods have seen substantia...
-
Authors: Astfalck, Lachlan; Williamson, Daniel; Gandy, Niall; Gregoire, Lauren; Ivanovic, Ruza
Affiliations: University of Leeds; University of Exeter; Alan Turing Institute; University of Western Australia
Abstract: Any experiment with climate models relies on a potentially large set of spatio-temporal boundary conditions. These can represent the initial state of the system and/or the forcings driving the model output throughout the experiment. These boundary conditions are typically fixed using available reconstructions in climate modeling studies; however, in reality they are highly uncertain, that uncertainty is unquantified, and the effect on the output of the experiment can be considerable. We devel...
-
Authors: Fang, Guanhua; Xu, Ganggang; Xu, Haochen; Zhu, Xuening; Guan, Yongtao
Affiliations: Fudan University; University of Miami; The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data
Abstract: In this work, we study the event occurrences of individuals interacting in a network. To characterize the dynamic interactions among the individuals, we propose a group network Hawkes process (GNHP) model whose network structure is observed and fixed. In particular, we introduce a latent group structure among individuals to account for the heterogeneous user-specific characteristics. A maximum likelihood approach is proposed to simultaneously cluster individuals in the network and estimate mod...