BAYESIAN JOINT MODELING OF MULTIPLE GENE NETWORKS AND DIVERSE GENOMIC DATA TO IDENTIFY TARGET GENES OF A TRANSCRIPTION FACTOR

Type:
Article
Authors:
Wei, Peng; Pan, Wei
Affiliations:
University of Texas System; University of Texas Health Science Center Houston; University of Texas School Public Health; University of Minnesota System; University of Minnesota Twin Cities
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/11-AOAS502
Publication date:
2012
Pages:
334-355
Keywords:
random-field model; variable selection; expression; binding; prediction; ontology; motifs
Abstract:
We consider integrative modeling of multiple gene networks and diverse genomic data, including protein-DNA binding, gene expression and DNA sequence data, to accurately identify the regulatory target genes of a transcription factor (TF). Rather than treating all genes equally and independently a priori, as existing joint modeling approaches do, we incorporate the biological prior knowledge that neighboring genes on a gene network tend to be (or not to be) regulated together by a TF. A key contribution of our work is that, to maximize the use of existing biological knowledge, we allow multiple gene networks to be incorporated into joint modeling of genomic data by introducing a mixture model based on multiple Markov random fields (MRFs). Another important contribution is that we allow different genomic data to be correlated, and we examine the validity and effect of the independence assumption adopted in existing methods. Because the approach is fully Bayesian, inference about model parameters can be carried out based on MCMC samples. Application to an E. coli data set, together with simulation studies, demonstrates the utility and statistical efficiency gains of the proposed joint model.
Source URL: