-
Authors: Marchetti-Bowick, Micol; Yu, Yaoliang; Wu, Wei; Xing, Eric P.
Affiliations: Carnegie Mellon University; University of Waterloo
Abstract: In this work, we present a new approach for jointly performing eQTL mapping and gene network inference while encouraging a transfer of information between the two tasks. We address this problem by formulating it as a multiple-output regression task in which we aim to learn the regression coefficients while simultaneously estimating the conditional independence relationships among the set of response variables. The approach we develop uses structured sparsity penalties to encourage the sharing ...
-
Authors: Zhang, Ningshan; Schmaus, Kyle; Perry, Patrick O.
Affiliations: New York University
Abstract: We consider a particular instance of a common problem in recommender systems, using a database of book reviews to inform user-targeted recommendations. In our dataset, books are categorized into genres and subgenres. To exploit this nested taxonomy, we use a hierarchical model that enables information pooling across similar items at many levels within the genre hierarchy. The main challenge in deploying this model is computational. The data sizes are large and fitting the model at scale...
-
Authors: Katsevich, Eugene; Sabatti, Chiara
Affiliations: Stanford University
Abstract: We tackle the problem of selecting from among a large number of variables those that are important for an outcome. We consider situations where groups of variables are also of interest. For example, each variable might be a genetic polymorphism, and we might want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms. In this context, to discover that a variable is relevant for the outcome implies discovering that the larger ent...
-
Authors: Berg, Stephen; Zhu, Jun; Clayton, Murray K.; Shea, Monika E.; Mladenoff, David J.
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison
Abstract: The Wisconsin Public Land Survey database describes historical forest composition at high spatial resolution and is of interest in ecological studies of forest composition in Wisconsin just prior to significant Euro-American settlement. For such studies it is useful to identify recurring subpopulations of tree species known as communities, but standard clustering approaches for subpopulation identification do not account for dependence between spatially nearby observations. Here, we develop an...
-
Authors: Liang, Kun
Affiliations: University of Waterloo
Abstract: Finding differentially expressed genes is a common task in high-throughput transcriptome studies. While traditional statistical methods rank the genes by their test statistics alone, we analyze an RNA sequencing dataset using the auxiliary information of gene length and the test statistics from a related microarray study. Given the auxiliary information, we propose a novel nonparametric empirical Bayes procedure to estimate the posterior probability of differential expression for each gene. We...
-
Authors: Zhang, Hongbin; Wu, Lang
Affiliations: City University of New York (CUNY) System; University of British Columbia
Abstract: For a time-to-event outcome with censored time-varying covariates, a joint Cox model with a linear mixed effects model is the standard modeling approach. In some applications such as AIDS studies, mechanistic nonlinear models are available for some covariate process such as viral load during anti-HIV treatments, derived from the underlying data-generation mechanisms and disease progression. Such a mechanistic nonlinear covariate model may provide better-predicted values when the covariates are...
-
Authors: Dobra, Adrian; Valdes, Camilo; Ajdic, Dragana; Clarke, Bertrand; Clarke, Jennifer
Affiliations: University of Washington; University of Washington Seattle; State University System of Florida; Florida International University; University of Miami; University of Miami; University of Nebraska System; University of Nebraska Lincoln
Abstract: There is a growing awareness of the important roles that microbial communities play in complex biological processes. Modern investigation of these often uses next generation sequencing of metagenomic samples to determine community composition. We propose a statistical technique based on clique loglinear models and Bayes model averaging to identify microbial components in a metagenomic sample at various taxonomic levels that have significant associations. We describe the model class, a stochast...
-
Authors: Huang, Yen-Ning; Reich, Brian J.; Fuentes, Montserrat; Sankarasubramanian, A.
Affiliations: North Carolina State University; Virginia Commonwealth University; North Carolina State University
Abstract: Computer simulation models are central to environmental science. These mathematical models are used to understand complex weather and climate patterns and to predict the climate's response to different forcings. Climate models are of course not perfect reflections of reality, and so comparison with observed data is needed to quantify and to correct for biases and other deficiencies. We propose a new method to calibrate model output using observed data. Our approach not only matches the margina...
-
Authors: Huo, Zhiguang; Song, Chi; Tseng, George
Affiliations: State University System of Florida; University of Florida; University System of Ohio; Ohio State University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Abstract: Due to the rapid development of high-throughput experimental techniques and fast-dropping prices, many transcriptomic datasets have been generated and accumulated in the public domain. Meta-analysis combining multiple transcriptomic studies can increase the statistical power to detect disease-related biomarkers. In this paper we introduce a Bayesian latent hierarchical model to perform transcriptomic meta-analysis. This method is capable of detecting genes that are differentially expressed (DE...
-
Authors: Regier, Jeffrey; Miller, Andrew C.; Schlegel, David; Adams, Ryan P.; McAuliffe, Jon D.; Prabhat
Affiliations: University of California System; University of California Berkeley; University of California System; University of California Berkeley; Columbia University; United States Department of Energy (DOE); Lawrence Berkeley National Laboratory; Princeton University
Abstract: We present a new, fully generative model for constructing astronomical catalogs from optical telescope image sets. Each pixel intensity is treated as a random variable with parameters that depend on the latent properties of stars and galaxies. These latent properties are themselves modeled as random. We compare two procedures for posterior inference. One procedure is based on Markov chain Monte Carlo (MCMC) while the other is based on variational inference (VI). The MCMC procedure excels at qu...