Optimal screening and discovery of sparse signals with applications to multistage high throughput studies
Output type:
Article
Authors:
Cai, T. Tony; Sun, Wenguang
Affiliations:
University of Pennsylvania; University of Southern California
Journal:
Journal of the Royal Statistical Society Series B: Statistical Methodology
ISSN/ISBN:
1369-7412
DOI:
10.1111/rssb.12171
Publication date:
2017
Pages:
197-223
Keywords:
false discovery
sample-size
null
proportion
designs
oracle
tests
rates
Abstract:
A common feature in large-scale scientific studies is that signals are sparse and it is desirable to narrow down significantly the focus to a much smaller subset in a sequential manner. We consider two related data screening problems: one is to find the smallest subset such that it virtually contains all signals and another is to find the largest subset such that it essentially contains only signals. These screening problems are closely connected to but distinct from the more conventional signal detection or multiple-testing problems. We develop phase transition diagrams to characterize the fundamental limits in simultaneous inference and derive data-driven screening procedures which control the error rates with near optimality properties. Applications in the context of multistage high throughput studies are discussed.