-
Authors: Jewell, Sean; Fearnhead, Paul; Witten, Daniela
Affiliations: University of Washington; University of Washington Seattle; Lancaster University; University of Washington; University of Washington Seattle
Abstract: While many methods are available to detect structural changes in a time series, few procedures are available to quantify the uncertainty of these estimates post-detection. In this work, we fill this gap by proposing a new framework to test the null hypothesis that there is no change in mean around an estimated changepoint. We further show that it is possible to efficiently carry out this framework in the case of changepoints estimated by binary segmentation and its variants, ℓ0 segmentation, o...
-
Authors: Guillaumin, Arthur P.; Sykulski, Adam M.; Olhede, Sofia C.; Simons, Frederik J.
Affiliations: University of London; Queen Mary University London; Lancaster University; Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; University of London; University College London; Princeton University
Abstract: We provide a computationally and statistically efficient method for estimating the parameters of a stochastic covariance model observed on a regular spatial grid in any number of dimensions. Our proposed method, which we call the Debiased Spatial Whittle likelihood, makes important corrections to the well-known Whittle likelihood to account for large sources of bias caused by boundary effects and aliasing. We generalize the approach to flexibly allow for significant volumes of missing data inc...
-
Authors: Jiang, Zhichao; Yang, Shu; Ding, Peng
Affiliations: University of Massachusetts System; University of Massachusetts Amherst; North Carolina State University; University of California System; University of California Berkeley
Abstract: Causal inference concerns not only the average effect of the treatment on the outcome but also the underlying mechanism through an intermediate variable of interest. Principal stratification characterizes such a mechanism by targeting subgroup causal effects within principal strata, which are defined by the joint potential values of an intermediate variable. Due to the fundamental problem of causal inference, principal strata are inherently latent, rendering it challenging to identify and esti...
-
Authors: Gronsbell, Jessica; Liu, Molei; Tian, Lu; Cai, Tianxi
Affiliations: University of Toronto; Harvard University; Stanford University; Harvard University; Harvard Medical School
Abstract: In many contemporary applications, large amounts of unlabelled data are readily available while labelled examples are limited. There has been substantial interest in semi-supervised learning (SSL) which aims to leverage unlabelled data to improve estimation or prediction. However, current SSL literature focuses primarily on settings where labelled data are selected uniformly at random from the population of interest. Stratified sampling, while posing additional analytical challenges, is highly...
-
Authors: Li, Didong; Mukhopadhyay, Minerva; Dunson, David B.
Affiliations: Princeton University; University of California System; University of California Los Angeles; Indian Institute of Technology System (IIT System); Indian Institute of Technology (IIT) - Kanpur; Duke University
Abstract: In statistical dimensionality reduction, it is common to rely on the assumption that high dimensional data tend to concentrate near a lower dimensional manifold. There is a rich literature on approximating the unknown manifold, and on exploiting such approximations in clustering, data compression, and prediction. Most of the literature relies on linear or locally linear approximations. In this article, we propose a simple and general alternative, which instead uses spheres, an approach we refe...
-
Authors: Moscovich, Amit; Rosset, Saharon
Affiliations: Tel Aviv University
Abstract: Cross-validation is the de facto standard for predictive model evaluation and selection. In proper use, it provides an unbiased estimate of a model's predictive performance. However, data sets often undergo various forms of data-dependent preprocessing, such as mean-centring, rescaling, dimensionality reduction and outlier removal. It is often believed that such preprocessing stages, if done in an unsupervised manner (that does not incorporate the class labels or response values), are generally...
-
Authors: Rubin-Delanchy, Patrick; Cape, Joshua; Tang, Minh; Priebe, Carey E.
Affiliations: University of Bristol; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; North Carolina State University; Johns Hopkins University
Abstract: Spectral embedding is a procedure which can be used to obtain vector representations of the nodes of a graph. This paper proposes a generalisation of the latent position network model known as the random dot product graph, to allow interpretation of those vector representations as latent position estimates. The generalisation is needed to model heterophilic connectivity (e.g. 'opposites attract') and to cope with negative eigenvalues more generally. We show that, whether the adjacency or norma...
-
Authors: de Fondeville, Raphael; Davison, Anthony C.
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
Abstract: Peaks-over-threshold analysis using the generalised Pareto distribution is widely applied in modelling tails of univariate random variables, but much information may be lost when complex extreme events are studied using univariate results. In this paper, we extend peaks-over-threshold analysis to extremes of functional data. Threshold exceedances defined using a functional r are modelled by the generalised r-Pareto process, a functional generalisation of the generalised Pareto distribution tha...
-
Authors: Rosset, Saharon; Heller, Ruth; Painsky, Amichai; Aharoni, Ehud
Affiliations: Tel Aviv University; Tel Aviv University; International Business Machines (IBM); IBM ISRAEL
Abstract: Multiple testing problems (MTPs) are a staple of modern statistical analysis. The fundamental objective of MTPs is to reject as many false null hypotheses as possible (that is, maximize some notion of power), subject to controlling an overall measure of false discovery, like family-wise error rate (FWER) or false discovery rate (FDR). In this paper we provide generalizations to MTPs of the optimal Neyman-Pearson test for a single hypothesis. We show that for simple hypotheses, for both FWER an...
-
Authors: Wang, Zhonglei; Peng, Liuhua; Kim, Jae Kwang
Affiliations: Xiamen University; Xiamen University; University of Melbourne; Iowa State University
Abstract: Bootstrap is a useful computational tool for statistical inference, but it may lead to erroneous analysis under complex survey sampling. In this paper, we propose a unified bootstrap method for stratified multi-stage cluster sampling, Poisson sampling, simple random sampling without replacement and probability proportional to size sampling with replacement. In the proposed bootstrap method, we first generate bootstrap finite populations, apply the same sampling design to each bootstrap populat...