High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

Type:
Article
Authors:
Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike
Affiliations:
University of Chicago; Duke University; Duke University; Duke University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214508000000869
Publication date:
2008
Pages:
1438-1456
Keywords:
human breast-cancer; variable selection; cyclin D1; complexity; mixture
Abstract:
We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived factors as representing biological subpathway structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology.
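
To make the idea of a sparse latent factor model concrete, the following is a minimal simulation sketch, not the authors' software or fitting procedure. It assumes a simplified model in which each gene's expression is a sparse linear combination of a few latent factors plus noise, with each loading set to exactly zero with high probability (a point-mass mixture draw). The function name `simulate_sparse_factor_data`, all parameter names and default values, and the Gaussian factor scores are illustrative assumptions; the abstract itself describes non-Gaussian/nonparametric latent components and also regression and ANOVA terms, which are omitted here.

```python
import random


def simulate_sparse_factor_data(n_genes=50, n_samples=20, n_factors=3,
                                inclusion_prob=0.1, noise_sd=0.5, seed=0):
    """Simulate expression data from a toy sparse factor model.

    Hypothetical illustration: data[g][i] = sum_k loadings[g][k] * factors[k][i] + noise.
    """
    rng = random.Random(seed)

    # Sparse loadings: each entry is exactly zero with probability
    # 1 - inclusion_prob, otherwise drawn from a standard normal.
    loadings = [
        [rng.gauss(0.0, 1.0) if rng.random() < inclusion_prob else 0.0
         for _ in range(n_factors)]
        for _ in range(n_genes)
    ]

    # Latent factor scores per sample (Gaussian here only for simplicity).
    factors = [
        [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for _ in range(n_factors)
    ]

    # Observed expression: factor contribution plus idiosyncratic noise.
    data = [
        [sum(loadings[g][k] * factors[k][i] for k in range(n_factors))
         + rng.gauss(0.0, noise_sd)
         for i in range(n_samples)]
        for g in range(n_genes)
    ]
    return loadings, factors, data


loadings, factors, data = simulate_sparse_factor_data()

# Fraction of loadings that are exactly zero (the "sparsity" of the model).
n_entries = len(loadings) * len(loadings[0])
n_nonzero = sum(1 for row in loadings for a in row if a != 0.0)
sparsity = 1.0 - n_nonzero / n_entries
print(f"{len(data)} genes x {len(data[0])} samples, loading sparsity = {sparsity:.2f}")
```

In this toy setting, most genes load on no factor at all, and the few nonzero loadings in each column identify the genes associated with that factor. This is the sense in which a fitted sparse factor can be read as a candidate pathway subcomponent.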