Large-scale multiple testing under dependence

成果类型:
Article
署名作者:
Sun, Wenguang; Cai, T. Tony
署名单位:
University of Pennsylvania
刊物名称:
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
ISSN/ISSBN:
1369-7412
DOI:
10.1111/j.1467-9868.2008.00694.x
发表日期:
2009
页码:
393-424
关键词:
false discovery rate maximum-likelihood-estimation compound decision rules hidden markov-models gene-expression probabilistic functions order estimation estimator number rates
摘要:
The paper considers the problem of multiple testing under dependence in a compound decision theoretic framework. The observed data are assumed to be generated from an underlying two-state hidden Markov model. We propose oracle and asymptotically optimal data-driven procedures that aim to minimize the false non-discovery rate FNR subject to a constraint on the false discovery rate FDR. It is shown that the performance of a multiple-testing procedure can be substantially improved by adaptively exploiting the dependence structure among hypotheses, and hence conventional FDR procedures that ignore this structural information are inefficient. Both theoretical properties and numerical performances of the procedures proposed are investigated. It is shown that the procedures proposed control FDR at the desired level, enjoy certain optimality properties and are especially powerful in identifying clustered non-null cases. The new procedure is applied to an influenza-like illness surveillance study for detecting the timing of epidemic periods.