A model-based background adjustment for oligonucleotide expression arrays
Document type:
Article
Authors:
Wu, ZJ; Irizarry, RA; Gentleman, R; Martinez-Murillo, F; Spencer, F
Affiliations:
Johns Hopkins University; Johns Hopkins University; Johns Hopkins University; Harvard University; Harvard University Medical Affiliates; Dana-Farber Cancer Institute
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214504000000683
Publication date:
2004
Pages:
909-917
Keywords:
probe level data
microarray data
normalization
summaries
variance
Abstract:
High-density oligonucleotide expression arrays are widely used in many areas of biomedical research. Affymetrix GeneChip arrays are the most popular. In the Affymetrix system, a fair amount of further preprocessing and data reduction occurs after the image-processing step. Statistical procedures developed by academic groups have been successful in improving the default algorithms provided by the Affymetrix system. In this article we present a solution to one of the preprocessing steps, background adjustment, based on a formal statistical framework. Our solution greatly improves the performance of the technology in various practical applications. These arrays use short oligonucleotides to probe for genes in an RNA sample. Typically, each gene is represented by 11-20 pairs of oligonucleotide probes. The first component of these pairs is referred to as a perfect match probe and is designed to hybridize only with transcripts from the intended gene (i.e., specific hybridization). However, hybridization by other sequences (i.e., nonspecific hybridization) is unavoidable. Furthermore, hybridization strengths are measured by a scanner that introduces optical noise. Therefore, the observed intensities need to be adjusted to give accurate measurements of specific hybridization. We have found that the default ad hoc adjustment, provided as part of the Affymetrix system, can be improved through the use of estimators derived from a statistical model that uses probe sequence information. A final step in preprocessing is to summarize the probe-level data for each gene to define a measure of expression that represents the amount of the corresponding mRNA species. In this article we illustrate the practical consequences of not adjusting appropriately for the presence of nonspecific hybridization and provide a solution based on our background adjustment procedure. Software that computes our adjustment is available as part of the Bioconductor Project (http://bioconductor.org).
Source URL: