-
Authors: Hanks, Ephraim M.; Hooten, Mevin B.; Knick, Steven T.; Oyler-McCance, Sara J.; Fike, Jennifer A.; Cross, Todd B.; Schwartz, Michael K.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; United States Department of the Interior; United States Geological Survey; Colorado State University System; Colorado State University Fort Collins; United States Department of the Interior; United States Geological Survey; United States Department of the Interior; United States Geological Survey; United States Department of the Interior; United States Geological Survey; University of Montana System; University of Montana; United States Department of Agriculture (USDA); United States Forest Service
Abstract: We propose a spatially-explicit approach for modeling genetic variation across space and illustrate how this approach can be used to optimize spatial prediction and sampling design for landscape genetic data. We propose a multinomial data model for categorical microsatellite allele data commonly used in landscape genetic studies, and introduce a latent spatial random effect to allow for spatial correlation between genetic observations. We illustrate how modern dimension reduction approaches to...
-
Authors: Goel, Sharad; Rao, Justin M.; Shroff, Ravi
Affiliations: Stanford University; Microsoft; New York University
Abstract: Recent studies have examined racial disparities in stop-and-frisk, a widely employed but controversial policing tactic. The statistical evidence, however, has been limited and contradictory. We investigate by analyzing three million stops in New York City over five years, focusing on cases where officers suspected the stopped individual of criminal possession of a weapon (CPW). For each CPW stop, we estimate the ex ante probability that the detained suspect has a weapon. We find that in more t...
-
Authors: Bolsinova, Maria; Maris, Gunter; Hoijtink, Herbert
Affiliations: Utrecht University; Utrecht University; University of Amsterdam
Abstract: One of the important questions in the practice of educational testing is how a particular test should be scored. In this paper we consider what an appropriate simple scoring rule should be for the Dutch as a second language test consisting of listening and reading items. As in many other applications, here the Rasch model, which allows the test to be scored with a simple sumscore, is too restrictive to adequately represent the data. In this study we propose an exploratory algorithm which clusters th...
-
Authors: Han, Fang; Han, Xiaoyan; Liu, Han; Caffo, Brian
Affiliations: Johns Hopkins University; Johns Hopkins University; Princeton University; University of Washington; University of Washington Seattle
Abstract: We propose a unified framework for conducting inference on complex aggregated data in high-dimensional settings. We assume the data are a collection of multiple non-Gaussian realizations with underlying undirected graphical structures. Using the concept of median graphs in summarizing the commonality across these graphical structures, we provide a novel semiparametric approach to modeling such complex aggregated data, along with robust estimation of the median graph, which is assumed to be sp...
-
Authors: Johnston, Ian; Hancock, Timothy; Mamitsuka, Hiroshi; Carvalho, Luis
Affiliations: Boston University; Department of Primary Industries & Regional Development NSW; Kyoto University
Abstract: Motivated by the important problem of detecting association between genetic markers and binary traits in genome-wide association studies, we present a novel Bayesian model that establishes a hierarchy between markers and genes by defining weights according to gene lengths and distances from genes to markers. The proposed hierarchical model uses these weights to define unique prior probabilities of association for markers based on their proximities to genes that are believed to be relevant to t...
-
Authors: Russell, Brook T.; Cooley, Daniel S.; Porter, William C.; Reich, Brian J.; Heald, Colette L.
Affiliations: Clemson University; Colorado State University System; Colorado State University Fort Collins; Massachusetts Institute of Technology (MIT); North Carolina State University; Massachusetts Institute of Technology (MIT)
Abstract: This project aims to explore which combinations of meteorological conditions are associated with extreme ground level ozone conditions. Our approach focuses only on the tail by optimizing the tail dependence between the ozone response and functions of meteorological covariates. Since there is a long list of possible meteorological covariates, the space of possible models cannot be explored completely. Consequently, we perform data mining within the model selection context, employing an automat...
-
Authors: Bendich, Paul; Marron, J. S.; Miller, Ezra; Pieloch, Alex; Skwerer, Sean
Affiliations: Duke University; University of North Carolina; University of North Carolina Chapel Hill; Yale University
Abstract: New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative t...
-
Authors: Zuo, Chandler; Chen, Kailei; Hewitt, Kyle J.; Bresnick, Emery H.; Keles, Sunduz
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison
Abstract: Integrative analysis of multiple experimental datasets measured over a large number of observational units is the focus of large numbers of contemporary genomic and epigenomic studies. The key objectives of such studies include not only inferring a hidden state of activity for each unit over individual experiments, but also detecting highly associated clusters of units based on their inferred states. Although there are a number of methods tailored for specific datasets, there is currently no s...
-
Authors: Gaetan, Carlo; Girardi, Paolo; Pastres, Roberto; Mangin, Antoine
Affiliations: Università Ca' Foscari Venezia
Abstract: The use of water quality indicators is of crucial importance to identify risks to the environment, society and human health. In particular, chlorophyll type A (Chl-a) is a shared indicator of trophic status, and for monitoring activities it may be useful to discover locally dangerous behaviours (for example, anoxic events). In this paper we consider a comprehensive data set, covering the whole Adriatic Sea, derived from Ocean Colour satellite data, during the period 2002-2012, with the ai...
-
Authors: Zhang, Nancy R.; Yakir, Benjamin; Xia, Li C.; Siegmund, David
Affiliations: University of Pennsylvania; Hebrew University of Jerusalem; Stanford University; Stanford University
Abstract: The detection of local genomic signals using high-throughput DNA sequencing data can be cast as a problem of scanning a Poisson random field for local changes in the rate of the process. We propose a likelihood-based framework for such scans, and derive formulas for false positive rate control and power calculations. The framework can also accommodate modified processes that involve overdispersion. As a specific, detailed example, we consider the detection of insertions and deletions by paired...