-
Authors: Mohler, G. O.; Short, M. B.; Brantingham, P. J.; Schoenberg, F. P.; Tita, G. E.
Affiliations: Santa Clara University; University of California System; University of California Los Angeles; University of California System; University of California Los Angeles; University of California System; University of California Los Angeles; University of California System; University of California Irvine
Abstract: Highly clustered event sequences are observed in certain types of crime data, such as burglary and gang violence, due to crime-specific patterns of criminal behavior. Similar clustering patterns are observed by seismologists, as earthquakes are well known to increase the risk of subsequent earthquakes, or aftershocks, near the location of an initial event. Space-time clustering is modeled in seismology by self-exciting point processes and the focus of this article is to show that these methods...
-
Authors: Datta, Gauri S.; Hall, Peter; Mandal, Abhyuday
Affiliations: University System of Georgia; University of Georgia; University of Melbourne
Abstract: The models used in small-area inference often involve unobservable random effects. While this can significantly improve the adaptivity and flexibility of a model, it also increases the variability of both point and interval estimators. If we could test for the existence of the random effects, and if the test were to show that they were unlikely to be present, then we would arguably not need to incorporate them into the model, and thus could significantly improve the precision of the methodolog...
-
Authors: Taddy, Matthew A.; Gramacy, Robert B.; Polson, Nicholas G.
Affiliations: University of Chicago; University of Cambridge
Abstract: Dynamic regression trees are an attractive option for automatic regression and classification with complicated response surfaces in online application settings. We create a sequential tree model whose state changes in time with the accumulation of new data, and provide particle learning algorithms that allow for the efficient online posterior filtering of tree states. A major advantage of tree regression is that it allows for the use of very simple models within each partition. The model also ...
-
Authors: Chen, Kun; Chen, Kehui; Mueller, Hans-Georg; Wang, Jane-Ling
Affiliations: Harvard University; Harvard University Medical Affiliates; Dana-Farber Cancer Institute; University of California System; University of California Davis
Abstract: We propose stringing, a class of methods where one views high-dimensional observations as functional data. Stringing takes advantage of the high dimension by representing such data as discretized and noisy observations that originate from a hidden smooth stochastic process. Assuming that the observations result from scrambling the original ordering of the observations of the process, stringing reorders the components of the high-dimensional vectors, followed by transforming the high-dimensiona...
-
Authors: Woodard, Dawn B.; Goldszmidt, Moises
Affiliations: Cornell University; Microsoft
Abstract: Large-scale distributed computing systems can suffer from occasional severe violation of performance goals; due to the complexity of these systems, manual diagnosis of the cause of the crisis is too slow to inform interventions taken during the crisis. Rapid automatic recognition of the recurrence of a problem can lead to cause diagnosis and informed intervention. We frame this as an online clustering problem, where the labels (causes) of some of the previous crises may be known. We give a fas...
-
Authors: Efromovich, Sam
Abstract: Nonparametric regression with predictors missing at random (MAR), where the probability of missing depends only on observed variables, is considered. Univariate predictor is the primary case of interest. A new adaptive orthogonal series estimator is developed. Large sample theory shows that the estimator is rate-minimax and it is also sharp-minimax whenever predictors are missing completely at random (MCAR). Furthermore, confidence bands, estimation of nuisance functions, including conditional...
-
Authors: Gile, Krista J.
Affiliations: University of Massachusetts System; University of Massachusetts Amherst
Abstract: Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing represen...
-
Authors: Robbins, Michael W.; Lund, Robert B.; Gallagher, Colin M.; Lu, Qiqi
Affiliations: Clemson University; Mississippi State University
Abstract: This article examines the North Atlantic tropical cyclone record for statistical discontinuities (changepoints). This is a controversial area and indeed, our end conclusions are opposite of those made in Dr. Kelvin Droegemeier's July 28, 2009 Senate testimonial. The methods developed here should help rigorize the debate. Elaborating, we develop a level-alpha test for a changepoint in a categorical data sequence sampled from a multinomial distribution. The proposed test statistic is the maximum...
-
Authors: Iacus, Stefano M.; King, Gary; Porro, Giuseppe
Affiliations: University of Milan; Harvard University; University of Trieste
Abstract: We introduce a new Monotonic Imbalance Bounding (MIB) class of matching methods for causal inference with a surprisingly large number of attractive statistical properties. MIB generalizes and extends in several new directions the only existing class, Equal Percent Bias Reducing (EPBR), which is designed to satisfy weaker properties and only in expectation. We also offer strategies to obtain specific members of the MIB class, and analyze in more detail a member of this class, called Coarsened E...
-
Authors: Xie, Minge; Singh, Kesar; Strawderman, William E.
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: This article develops a unifying framework, as well as robust meta-analysis approaches, for combining studies from independent sources. The device used in this combination is a confidence distribution (CD), which uses a distribution function, instead of a point (point estimator) or an interval (confidence interval), to estimate a parameter of interest. A CD function contains a wealth of information for inferences, and it is a useful device for combining studies from different sources. The prop...