Detecting simultaneous changepoints in multiple sequences

Document type:
Article
Authors:
Zhang, Nancy R.; Siegmund, David O.; Ji, Hanlee; Li, Jun Z.
Affiliations:
Stanford University; Stanford University; University of Michigan System; University of Michigan
Journal:
BIOMETRIKA
ISSN/ISBN:
0006-3444
DOI:
10.1093/biomet/asq025
Publication date:
2010
Pages:
631-645
Keywords:
array CGH data; statistical analysis; segmentation; meta-analysis
Abstract:
We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likelihood ratio for a model where the errors in each sample are independent. The simple geometry of the statistic allows us to derive accurate analytic approximations to the significance level of such scans. The formulation of the model is motivated by the biological problem of detecting recurrent DNA copy number variants in multiple samples. We show using replicates and parent-child comparisons that pooling data across samples results in more accurate detection of copy number variants. We also apply the multisample segmentation algorithm to the analysis of a cohort of tumour samples containing complex nested and overlapping copy number aberrations, for which our method gives a sparse and intuitive cross-sample summary.
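As a rough illustration of the scan described in the abstract, the following minimal Python sketch sums squared standardized interval statistics across samples and reports the interval with the largest sum. It is not the authors' implementation: the function name `sum_chisq_scan` and the parameter `max_width` are hypothetical, and the sketch assumes independent noise with known unit variance in every sequence, a baseline estimated by each sequence's own mean, and a brute-force search over all windows up to `max_width` positions wide. It does not compute the analytic significance approximations mentioned in the abstract.

```python
import numpy as np


def sum_chisq_scan(y, max_width):
    """Scan aligned sequences for the interval with the largest summed statistic.

    y         : array of shape (n_samples, T), one noisy sequence per row.
    max_width : largest window width (in positions) to consider.

    Returns (statistic, start, end), with the interval being [start, end).
    Assumes unit noise variance in every sequence (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    n, T = y.shape

    # Remove each sequence's overall mean, used here as the baseline estimate.
    y = y - y.mean(axis=1, keepdims=True)

    # Cumulative sums with a leading zero: sum over [s, t) is c[:, t] - c[:, s].
    c = np.concatenate([np.zeros((n, 1)), np.cumsum(y, axis=1)], axis=1)

    best = (-np.inf, 0, 0)
    for w in range(1, min(max_width, T - 1) + 1):
        seg = c[:, w:] - c[:, :-w]            # window sums, shape (n, T - w + 1)
        # Null variance of a centred window sum with unit noise: w * (1 - w / T).
        z2 = seg ** 2 / (w * (1.0 - w / T))   # per-sample squared z-scores
        stat = z2.sum(axis=0)                 # pool evidence across samples
        s = int(np.argmax(stat))
        if stat[s] > best[0]:
            best = (float(stat[s]), s, s + w)
    return best
```

Calling `sum_chisq_scan(y, max_width=10)` on an array `y` with one row per sample returns the pooled statistic and the half-open interval `[start, end)` that maximizes it. Summing squared statistics rather than signed ones means that gains and losses in different samples at the same location both add to the evidence.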