-
Authors: Rusch, Thomas; Lee, Ilro; Hornik, Kurt; Jank, Wolfgang; Zeileis, Achim
Affiliations: Vienna University of Economics & Business; Vienna University of Economics & Business; University of New South Wales Sydney; State University System of Florida; University of South Florida; University of Innsbruck
Abstract: In political campaigning substantial resources are spent on voter mobilization, that is, on identifying and influencing as many people as possible to vote. Campaigns use statistical tools for deciding whom to target (microtargeting). In this paper we describe a nonpartisan campaign that aims at increasing overall turnout using the example of the 2004 US presidential election. Based on a real data set of 19,634 eligible voters from Ohio, we introduce a modern statistical framework well suited f...
-
Authors: Chandler, Richard B.; Royle, J. Andrew
Affiliations: United States Department of the Interior; United States Geological Survey
Abstract: Recently developed spatial capture-recapture (SCR) models represent a major advance over traditional capture-recapture (CR) models because they yield explicit estimates of animal density instead of population size within an unknown area. Furthermore, unlike nonspatial CR methods, SCR models account for heterogeneity in capture probability arising from the juxtaposition of animal activity centers and sample locations. Although the utility of SCR methods is gaining recognition, the requirement t...
-
Authors: Gramacy, Robert B.; Taddy, Matt; Wild, Stefan M.
Affiliations: University of Chicago; United States Department of Energy (DOE); Argonne National Laboratory; University of Chicago; University of Chicago
Abstract: We investigate an application in the automatic tuning of computer codes, an area of research that has come to prominence alongside the recent rise of distributed scientific processing and heterogeneity in high-performance computing environments. Here, the response function is nonlinear and noisy and may not be smooth or stationary. Clearly needed are variable selection, decomposition of influence, and analysis of main and secondary effects for both real-valued and binary inputs and outputs. Ou...
-
Authors: Gaydos, Travis L.; Heckman, Nancy E.; Kirkpatrick, Mark; Stinchcombe, J. R.; Schmitt, Johanna; Kingsolver, Joel; Marron, J. S.
Affiliations: MITRE Corporation; University of British Columbia; University of Texas System; University of Texas Austin; University of Toronto; University of California System; University of California Davis; University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina; University of North Carolina Chapel Hill
Abstract: Principal Components Analysis (PCA) is a common way to study the sources of variation in a high-dimensional data set. Typically, the leading principal components are used to understand the variation in the data or to reduce the dimension of the data for subsequent analysis. The remaining principal components are ignored since they explain little of the variation in the data. However, evolutionary biologists gain important insights from these low variation directions. Specifically, they are int...
-
Authors: Kleiber, William; Sain, Stephan R.; Heaton, Matthew J.; Wiltberger, Michael; Reese, C. Shane; Bingham, Derek
Affiliations: University of Colorado System; University of Colorado Boulder; Brigham Young University; Brigham Young University; Simon Fraser University
Abstract: Geomagnetic storms play a critical role in space weather physics with the potential for far reaching economic impacts including power grid outages, air traffic rerouting, satellite damage and GPS disruption. The LFM-MIX is a state-of-the-art coupled magnetospheric-ionospheric model capable of simulating geomagnetic storms. Imbedded in this model are physical equations for turning the magnetohydrodynamic state parameters into energy and flux of electrons entering the ionosphere, involving a set...
-
Authors: Stein, Michael L.; Chen, Jie; Anitescu, Mihai
Affiliations: University of Chicago; United States Department of Energy (DOE); Argonne National Laboratory
Abstract: We discuss the statistical properties of a recently introduced unbiased stochastic approximation to the score equations for maximum likelihood calculation for Gaussian processes. Under certain conditions, including bounded condition number of the covariance matrix, the approach achieves O(n) storage and nearly O(n) computational effort per optimization step, where n is the number of data sites. Here, we prove that if the condition number of the covariance matrix is bounded, then the approximat...
-
Authors: Shen, Ronglai; Wang, Sijian; Mo, Qianxing
Affiliations: Memorial Sloan Kettering Cancer Center; University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison; Baylor College of Medicine
Abstract: High resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation and gene expression associated with a disease. An integrated genomic profiling approach measures multiple omics data types simultaneously in the same set of biological samples. Such approach renders an integrated data resolution that would not be available with any single data type. In this study, we use penalized latent variable regre...
-
Authors: Airoldi, Edoardo M.; Wang, Xiaopei; Lin, Xiaodong
Affiliations: Harvard University; University System of Ohio; University of Cincinnati; Rutgers University System; Rutgers University New Brunswick
Abstract: We consider the problem of quantifying temporal coordination between multiple high-dimensional responses. We introduce a family of multi-way stochastic blockmodels suited for this problem, which avoids preprocessing steps such as binning and thresholding commonly adopted for this type of data, in biology. We develop two inference procedures based on collapsed Gibbs sampling and variational methods. We provide a thorough evaluation of the proposed methods on simulated data, in terms of membersh...
-
Authors: Crossett, Andrew; Lee, Ann B.; Klei, Lambertus; Devlin, Bernie; Roeder, Kathryn
Affiliations: Pennsylvania State System of Higher Education (PASSHE); West Chester University of Pennsylvania; Carnegie Mellon University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: Recent technological advances coupled with large sample sets have uncovered many factors underlying the genetic basis of traits and the predisposition to complex disease, but much is left to discover. A common thread to most genetic investigations is familial relationships. Close relatives can be identified from family records, and more distant relatives can be inferred from large panels of genetic markers. Unfortunately these empirical estimates can be noisy, especially regarding distant rel...
-
Authors: Handorf, Elizabeth A.; Bekelman, Justin E.; Heitjan, Daniel F.; Mitra, Nandita
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University; Fox Chase Cancer Center; University of Pennsylvania; University of Pennsylvania
Abstract: Estimates of the effects of treatment on cost from observational studies are subject to bias if there are unmeasured confounders. It is therefore advisable in practice to assess the potential magnitude of such biases. We derive a general adjustment formula for loglinear models of mean cost and explore special cases under plausible assumptions about the distribution of the unmeasured confounder. We assess the performance of the adjustment by simulation, in particular, examining robustness to a ...