-
Authors: Mitani, Aya A.; Kaye, Elizabeth K.; Nelson, Kerrie P.
Affiliations: University of Toronto; Boston University; Boston University
Abstract: Periodontal disease is a serious gum infection that affects half of the U.S. adult population and may lead to tooth loss. Using standard marginal models to study the association between patient-level predictors and tooth-level outcomes can lead to biased estimates because the independence assumption between the outcome (periodontal disease) and cluster size (number of teeth per patient) is violated. Specifically, the baseline number of teeth of a patient is informative. In this setting a clus...
-
Authors: Farcomeni, Alessio
Affiliations: University of Rome Tor Vergata
Abstract: We estimate the number of migrants and refugees that died while trying to enter the European Union, during a period of 25 years. Only a subset of attempts with at least one casualty are reported by at least one media source. In order to obtain the estimate, we propose a regression-extrapolation approach, for joint estimation of population size (here, the number of deadly individual or group attempts) and the sum of an accompanying trait (here, the number of deaths) over the population. The tr...
-
Authors: Liu, Yiwen; Sun, Xiaoxiao; Zhong, Wenxuan; Li, Bing
Affiliations: University of Arizona; University System of Georgia; University of Georgia; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Very often for the same scientific question, there may exist different techniques or experiments that measure the same numerical quantity. Historically, various methods have been developed to exploit the information within each type of data independently. However, statistical data fusion methods that could effectively integrate multisource data under a unified framework are lacking. In this paper we propose a novel data fusion method, called Bscaling, for integrating multisource data. Consider...
-
Authors: Lock, Eric F.; Park, Jun Young; Hoadley, Katherine A.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; University of Toronto; University of North Carolina; University of North Carolina Chapel Hill
Abstract: Several modern applications require the integration of multiple large data matrices that have shared rows and/or columns. For example, cancer studies that integrate multiple omics platforms across multiple types of cancer, pan-omics pan-cancer analysis, have extended our knowledge of molecular heterogeneity beyond what was observed in single tumor and single platform studies. However, these studies have been limited by available statistical methodology. We propose a flexible approach to the si...
-
Authors: Baugh, Samuel; McKinnon, Karen
Affiliations: University of California System; University of California Los Angeles; University of California System; University of California Los Angeles
Abstract: The accurate quantification of changes in the heat content of the world's oceans is crucial for our understanding of the effects of increasing greenhouse gas concentrations. The Argo program, consisting of Lagrangian floats that measure vertical temperature profiles throughout the global ocean, has provided a wealth of data from which to estimate ocean heat content. However, creating a globally consistent statistical model for ocean heat content remains challenging due to the need for a glob...
-
Authors: Legramanti, Sirio; Rigon, Tommaso; Durante, Daniele; Dunson, David B.
Affiliations: Bocconi University; Bocconi University; University of Milano-Bicocca; Duke University
Abstract: Reliably learning group structures among nodes in network data is challenging in several applications. We are particularly motivated by studying covert networks that encode relationships among criminals. These data are subject to measurement errors, and exhibit a complex combination of an unknown number of core-periphery, assortative and disassortative structures that may unveil key architectures of the criminal organization. The coexistence of these noisy block patterns limits the reliability...
-
Authors: Passino, Francesco Sanna; Turcotte, Melissa J. M.; Heard, Nicholas A.
Affiliations: Imperial College London; Microsoft
Abstract: Graph link prediction is an important task in cybersecurity: relationships between entities within a computer network, such as users interacting with computers or system libraries and the corresponding processes that use them, can provide key insights into adversary behaviour. Poisson matrix factorisation (PMF) is a popular model for link prediction in large networks, particularly useful for its scalability. In this article PMF is extended to include scenarios that are commonly encountered in ...
-
Authors: Schafer, Toryn L. J.; Wikle, Christopher K.; Hooten, Mevin B.
Affiliations: University of Missouri System; University of Missouri Columbia; United States Department of the Interior; United States Geological Survey; Colorado State University System; Colorado State University Fort Collins; Colorado State University System; Colorado State University Fort Collins; Colorado State University System; Colorado State University Fort Collins
Abstract: Agent-based methods allow for defining simple rules that generate complex group behaviors. The governing rules of such models are typically set a priori, and parameters are tuned from observed behavior trajectories. Instead of making simplifying assumptions across all anticipated scenarios, inverse reinforcement learning provides inference on the short-term (local) rules governing long-term behavior policies by using properties of a Markov decision process. We use the computationally efficient...
-
Authors: Wu, Jing; Ward, Owen G.; Curley, James; Zheng, Tian
Affiliations: Columbia University; University of Texas System; University of Texas Austin
Abstract: Modeling event dynamics is central to many disciplines. Patterns in observed social interaction events can be commonly modeled using point processes. Such social interaction event data often exhibit self-exciting, heterogeneous and sporadic trends, which are challenging for conventional models. It is reasonable to assume that there exists a hidden state process that drives different event dynamics at different states. In this paper we propose a Markov modulated Hawkes process (MMHP) model for le...
-
Authors: Zhang, Boya; Gramacy, Robert B.; Johnson, Leah R.; Rose, Kenneth A.; Smith, Eric
Affiliations: Virginia Polytechnic Institute & State University; University System of Maryland; University of Maryland Center for Environmental Science
Abstract: Delta smelt is an endangered fish species in the San Francisco estuary that has shown an overall population decline over the past 30 years. Researchers have developed a stochastic, agent-based simulator to virtualize the system with the goal of understanding the relative contribution of natural and anthropogenic factors that might play a role in their decline. However, the input configuration space is high dimensional, running the simulator is time-consuming, and its noisy outputs change nonl...