-
Authors: Kipnis, Alon
Affiliations: Stanford University
Abstract: We adapt the higher criticism (HC) goodness-of-fit test to measure the closeness between word-frequency tables. We apply this measure to authorship attribution challenges, where the goal is to identify the author of a document using other documents whose authorship is known. The method is simple yet performs well without handcrafting and tuning, reporting accuracy at the state-of-the-art level in various current challenges. As an inherent side effect, the HC calculation identifies a subset of ...
-
Authors: Rahman, Tanbin; Huang, Hsin-En; Li, Yujia; Tai, An-Shun; Hseih, Wen-Ping; McClung, Colleen A.; Tseng, George
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; National Tsing Hua University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: Supervised machine learning methods have been increasingly used in biomedical research and clinical practice. In transcriptomic applications, RNA-seq data have become dominant and have gradually replaced traditional microarrays, due to their reduced background noise and increased digital precision. Most existing machine learning methods are, however, designed for continuous intensities of microarray and are not suitable for RNA-seq count data. In this paper we develop a negative binomial mode...
-
Authors: Schafer, Toryn L. J.; Wikle, Christopher K.; Hooten, Mevin B.
Affiliations: University of Missouri System; University of Missouri Columbia; United States Department of the Interior; United States Geological Survey; Colorado State University System; Colorado State University Fort Collins; Colorado State University System; Colorado State University Fort Collins; Colorado State University System; Colorado State University Fort Collins
Abstract: Agent-based methods allow for defining simple rules that generate complex group behaviors. The governing rules of such models are typically set a priori, and parameters are tuned from observed behavior trajectories. Instead of making simplifying assumptions across all anticipated scenarios, inverse reinforcement learning provides inference on the short-term (local) rules governing long-term behavior policies by using properties of a Markov decision process. We use the computationally efficient...
-
Authors: Wu, Jing; Ward, Owen G.; Curley, James; Zheng, Tian
Affiliations: Columbia University; University of Texas System; University of Texas Austin
Abstract: Modeling event dynamics is central to many disciplines. Patterns in observed social interaction events are commonly modeled using point processes. Such social interaction event data often exhibit self-exciting, heterogeneous and sporadic trends, which are challenging for conventional models. It is reasonable to assume that there exists a hidden state process that drives different event dynamics at different states. In this paper we propose a Markov modulated Hawkes process (MMHP) model for le...
-
Authors: Zhang, Boya; Gramacy, Robert B.; Johnson, Leah R.; Rose, Kenneth A.; Smith, Eric
Affiliations: Virginia Polytechnic Institute & State University; University System of Maryland; University of Maryland Center for Environmental Science
Abstract: Delta smelt is an endangered fish species in the San Francisco estuary that has shown an overall population decline over the past 30 years. Researchers have developed a stochastic, agent-based simulator to virtualize the system with the goal of understanding the relative contribution of natural and anthropogenic factors that might play a role in its decline. However, the input configuration space is high dimensional, running the simulator is time-consuming, and its noisy outputs change nonl...
-
Authors: Chkrebtii, Oksana A.; Garcia, Yury E.; Capistran, Marcos A.; Noyola, Daniel E.
Affiliations: University System of Ohio; Ohio State University; CIMAT - Centro de Investigacion en Matematicas; Universidad Autonoma de San Luis Potosi
Abstract: Before the current pandemic, influenza and respiratory syncytial virus (RSV) were the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. In this setting, medical doctors typically based the diagnosis of ARI on patients' symptoms alone and did not routinely conduct virological tests necessary to identify individual viruses, limiting the ability to study the interaction between multiple pathogens and to make public health recommendations. We consider a st...
-
Authors: Guo, Xiaoyang; Bal, Aditi Basu; Needham, Tom; Srivastava, Anuj
Affiliations: State University System of Florida; Florida State University; State University System of Florida; Florida State University
Abstract: The arterial networks in the human brain, termed brain arterial networks or BANs, are complex arrangements of individual arteries, branching patterns, and interconnectivity. BANs play an essential role in characterizing and understanding brain physiology, and one would like tools for statistically analyzing the shapes of BANs. These tools include quantifying shape differences, comparing populations of subjects, and studying the effects of covariates on these shapes. This paper mathematically r...
-
Authors: Rafei, Ali; Flannagan, Carol A. C.; West, Brady T.; Elliott, Michael R.
Affiliations: University of Michigan System; University of Michigan; University of Michigan System; University of Michigan; University of Michigan System; University of Michigan
Abstract: Big Data often presents as massive nonprobability samples. Not only is the selection mechanism often unknown but larger data volume amplifies the relative contribution of selection bias to total error. Existing bias adjustment approaches assume that the conditional mean structures have been correctly specified for the selection indicator or key substantive measures. In the presence of a reference probability sample, these methods rely on a pseudolikelihood method to account for the sampling w...
-
Authors: Aliverti, Emanuele; Dunson, David B.
Affiliations: Universita Ca Foscari Venezia; Duke University
Abstract: Psychiatric studies of suicide provide fundamental insights on the evolution of severe psychopathologies, and contribute to the development of early treatment interventions. Our focus is on modelling different traits of psychosis and their interconnections, focusing on a case study on suicide attempt survivors. Such aspects are recorded via multivariate categorical data, involving a large number of items for multiple subjects. Current methods for multivariate categorical data, such as penalize...
-
Authors: Wilson, Ander; Hsu, Hsiao-Hsien Leon; Chiu, Yueh-Hsiu Mathilda; Wright, Robert O.; Wright, Rosalind J.; Coull, Brent A.
Affiliations: Colorado State University System; Colorado State University Fort Collins; Icahn School of Medicine at Mount Sinai; Harvard University; Harvard T.H. Chan School of Public Health
Abstract: Exposures to environmental chemicals during gestation can alter health status later in life. Most studies of maternal exposure to chemicals during pregnancy have focused on a single chemical exposure observed at high temporal resolution. Recent research has turned to focus on exposure to mixtures of multiple chemicals, generally observed at a single time point. We consider statistical methods for analyzing data on chemical mixtures that are observed at a high temporal resolution. As motivation...