-
Authors: Mariadassou, Mahendra; Robin, Stephane; Vacher, Corinne
Affiliations: Universite Paris Saclay; AgroParisTech; INRAE; Universite de Bordeaux
Abstract: As more and more network-structured data sets become available, the statistical analysis of valued graphs has become commonplace. Looking for a latent structure is one of the many strategies used to better understand the behavior of a network. Several methods already exist for the binary case. We present a model-based strategy to uncover groups of nodes in valued graphs. This framework can be used for a wide span of parametric random graph models and allows covariates to be included. Variational to...
-
Authors: Warton, David I.; Shepherd, Leah C.
Affiliations: University of New South Wales Sydney
Abstract: Presence-only data, point locations where a species has been recorded as being present, are often used in modeling the distribution of a species as a function of a set of explanatory variables, whether to map species occurrence, to understand its association with the environment, or to predict its response to environmental change. Currently, ecologists most commonly analyze presence-only data by adding randomly chosen pseudo-absences to the data such that it can be analyzed using logistic regre...
-
Authors: Gertheiss, Jan; Tutz, Gerhard
Affiliations: University of Munich
Abstract: Shrinking methods in regression analysis are usually designed for metric predictors. In this article, however, shrinkage methods for categorial predictors are proposed. As an application we consider data from the Munich rent standard, where, for example, urban districts are treated as a categorial predictor. If independent variables are categorial, some modifications to usual shrinking procedures are necessary. Two $L_1$-penalty-based methods for factor selection and clustering of categories are...
-
Authors: Fienberg, Stephen E.
Affiliations: Carnegie Mellon University
-
Authors: Lu, Qiqi; Lund, Robert; Lee, Thomas C. M.
Affiliations: Mississippi State University; Clemson University; Colorado State University System; Colorado State University Fort Collins; Chinese University of Hong Kong
Abstract: This paper proposes an information theory approach to estimate the number of changepoints and their locations in a climatic time series. A model is introduced that has an unknown number of changepoints and allows for series autocorrelations, periodic dynamics, and a mean shift at each changepoint time. An objective function gauging the number of changepoints and their locations, based on a minimum description length (MDL) information criterion, is derived. A genetic algorithm is then developed...
-
Authors: Berrocal, Veronica J.; Gelfand, Alan E.; Holland, David M.
Affiliations: University of Michigan System; University of Michigan; Duke University; United States Environmental Protection Agency
Abstract: Ozone and particulate matter, PM2.5, are co-pollutants that have long been associated with increased public health risks. Information on concentration levels for both pollutants comes from two sources: monitoring sites and output from complex numerical models that produce concentration surfaces over large spatial regions. In this paper, we offer a fully model-based approach for fusing these two sources of information for the pair of co-pollutants which is computationally feasible over large sp...
-
Authors: Zhang, Tingting; Kou, S. C.
Affiliations: University of Virginia
Abstract: Doubly stochastic Poisson processes, also known as Cox processes, frequently occur in various scientific fields. In this article, motivated primarily by analyzing Cox process data in biophysics, we propose a nonparametric kernel-based inference method. We conduct a detailed study, including an asymptotic analysis, of the proposed method, and provide guidelines for its practical use, introducing a fast and stable regression method for bandwidth selection. We apply our method to real photon ...
-
Authors: Lau, Ada; McSharry, Patrick
Affiliations: University of Oxford
Abstract: The generation of multi-step density forecasts for non-Gaussian data mostly relies on Monte Carlo simulations, which are computationally intensive. Using aggregated wind power in Ireland, we study two approaches to multi-step density forecasts which can be obtained from simple iterations so that intensive computations are avoided. In the first approach, we apply a logistic transformation to normalize the data approximately and describe the transformed data using ARIMA-GARCH models so that multi...
-
Authors: Meinshausen, Nicolai
Affiliations: University of Oxford
Abstract: When choosing a suitable technique for regression and classification with multivariate predictor variables, one is often faced with a tradeoff between interpretability and high predictive accuracy. To give a classical example, classification and regression trees are easy to understand and interpret. Tree ensembles like Random Forests usually provide more accurate predictions. Yet tree ensembles are also more difficult to analyze than single trees and are often criticized, perhaps unfairly, as ...
-
Authors: Bissantz, Nicolai; Holzmann, Hajo; Pawlak, Miroslaw
Affiliations: Ruhr University Bochum; Philipps University Marburg; University of Manitoba
Abstract: A method for estimating the axis of reflectional symmetry of an image $f(x, y)$ on the unit disc $D = \{(x, y) : x^2 + y^2 \le 1\}$ is proposed, given that noisy data of $f(x, y)$ are observed on a discrete grid of edge width $\Delta$. Our estimation procedure is based on minimizing over $\beta \in [0, \pi)$ the $L_2$ distance between empirical versions of $f$ and $\tau_\beta f$, the image of $f$ after reflection at the axis along $(\cos\beta, \sin\beta)$. Here, $f$ and $\tau_\beta f$ are estimated using truncat...