A two-way semilinear model for normalization and analysis of cDNA microarray data

成果类型:
Article
署名作者:
Huang, J; Wang, D; Zhang, CH
署名单位:
University of Iowa; University of Iowa; University of Alabama System; University of Alabama Birmingham; Rutgers University System; Rutgers University New Brunswick
刊物名称:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISSBN:
0162-1459
DOI:
10.1198/016214504000002032
发表日期:
2005
页码:
814-829
关键词:
differentially expressed genes variance
摘要:
A basic question in analyzing cDNA microarray data is normalization, the purpose of which is to remove systematic bias in the observed expression values by establishing a normalization curve across the whole dynamic range. A proper normalization procedure ensures that the normalized intensity ratios provide meaningful measures of relative expression levels. We propose a two-way semilinear model (TW-SLM) for normalization and analysis of microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that the percentage of differentially expressed genes is small or that there is symmetry in the expression levels of up-regulated and down-regulated genes, as required in the lowess normalization method. The TW-SLM also naturally incorporates uncertainty due to normalization into significance analysis of microarrays. We use a semiparametric approach based on polynomial splines in the TW-SLM to estimate the normalization curves and the normalized expression values. We study the theoretical properties of the proposed estimator in the TW-SLM, including the finite-sample distributional properties of the estimated gene effects and the rate of convergence of the estimated normalization curves when the number of genes under study is large. We also conduct simulation studies to evaluate the TW-SLM method and illustrate the proposed method using a published microarray dataset.