A Bayesian Model for Cross-Study Differential Gene Expression
Document type:
Article
Authors:
Scharpf, Robert B.; Tjelmeland, Hakon; Parmigiani, Giovanni; Nobel, Andrew B.
Affiliations:
Norwegian University of Science & Technology (NTNU); Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health; Johns Hopkins University; Johns Hopkins Medicine; University of North Carolina; University of North Carolina Chapel Hill
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/jasa.2009.ap07611
Publication date:
2009
Pages:
1295-1310
Keywords:
microarray data
molecular classification
mixture model
metaanalysis
profiles
adenocarcinoma
normalization
computation
validation
carcinomas
Abstract:
In this article we define a hierarchical Bayesian model for microarray expression data collected from several studies and use it to identify genes that show differential expression between two conditions. Key features include shrinkage across both genes and studies, and flexible modeling that allows for interactions between platforms and the estimated effect, as well as concordant and discordant differential expression across studies. We evaluate the performance of our model in a comprehensive fashion, using both artificial data and a split-study validation approach that provides an agnostic assessment of the model's behavior under both the null hypothesis and a realistic alternative. The simulation results from the artificial data demonstrate the advantages of the Bayesian model. Furthermore, the simulations provide guidelines for when the Bayesian model is most likely to be useful. Most notably, in small studies the Bayesian model generally outperforms other methods when evaluated on several performance measures across a range of simulation parameters, with the differences diminishing for larger sample sizes in the individual studies. The split-study validation illustrates appropriate shrinkage of the Bayesian model in the absence of platform, sample, and annotation differences that otherwise complicate experimental data analyses. Finally, we fit our model to four breast cancer studies using different technologies (cDNA and Affymetrix) to estimate differential expression in estrogen receptor-positive tumors versus estrogen receptor-negative tumors. Software and data for reproducing our analysis are publicly available.
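
The shrinkage mentioned in the abstract can be illustrated with a much simpler stand-in than the article's hierarchical model. The sketch below is not the authors' method: it assumes a conjugate normal-normal setup in which a gene's observed effect in one study is distributed as N(theta, std_error^2) and theta has a N(prior_mean, prior_sd^2) prior. The function name `shrink_effect` and all numeric values are hypothetical, chosen only to show why noisier estimates from small studies are pulled more strongly toward the shared mean.

```python
import math

def shrink_effect(estimate, std_error, prior_mean=0.0, prior_sd=1.0):
    """Posterior mean and sd of a study-level effect under a normal-normal model.

    Assumed model (an illustration, not the article's model):
        estimate ~ N(theta, std_error^2),  theta ~ N(prior_mean, prior_sd^2)
    """
    # Fraction of the observed deviation from the prior mean that is retained.
    weight = prior_sd**2 / (prior_sd**2 + std_error**2)
    post_mean = prior_mean + weight * (estimate - prior_mean)
    # Posterior variance is weight * std_error^2.
    post_sd = math.sqrt(weight) * std_error
    return post_mean, post_sd

# The same observed effect in a small study (large standard error)
# and in a larger study (small standard error); values are made up.
small_study = shrink_effect(2.0, std_error=1.0)
large_study = shrink_effect(2.0, std_error=0.5)
print(small_study, large_study)
```

In this toy setting the small-study estimate is shrunk further toward the prior mean than the large-study estimate, which is consistent with the abstract's remark that the benefit of the Bayesian model is greatest for small studies and diminishes as sample sizes grow.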