Selection-Corrected Statistical Inference for Region Detection With High-Throughput Assays
Publication type:
Article
Authors:
Benjamini, Yuval; Taylor, Jonathan; Irizarry, Rafael A.
Affiliations:
Hebrew University of Jerusalem; Stanford University; Harvard University; Harvard University Medical Affiliates; Dana-Farber Cancer Institute; Harvard University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2018.1498347
Publication date:
2019
Pages:
1351-1365
Keywords:
cluster-size inference
false discovery rate
DNA methylation
confidence-intervals
local maxima
random-field
extent
peaks
array
Abstract:
Scientists use high-dimensional measurement assays to detect and prioritize regions of strong signal in a spatially organized domain. Examples include finding methylation-enriched genomic regions using microarrays, and active cortical areas using brain imaging. The most common procedure for detecting potential regions is to group neighboring sites where the signal passed a threshold. However, one needs to account for the selection bias induced by this procedure to avoid diminishing effects when generalizing to a population. This article introduces pin-down inference, a model and an inference framework that permit population inference for these detected regions. Pin-down inference provides nonasymptotic point and confidence interval estimators for the mean effect in the region that account for local selection bias. Our estimators accommodate nonstationary covariances that are typical of these data, allowing researchers to better compare regions of different sizes and correlation structures. Inference is provided within a conditional one-parameter exponential family per region, with truncations that match the selection constraints. A secondary screening-and-adjustment step allows pruning the set of detected regions, while controlling the false-coverage rate over the reported regions. We apply the method to genomic regions with differing DNA-methylation rates across tissues. Our method provides superior power compared to other conditional and nonparametric approaches.
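The detection step that the abstract calls the most common procedure, grouping neighboring sites where the signal passed a threshold, can be illustrated with a minimal sketch. This is not the authors' code and it does not implement pin-down inference. The function names `detect_regions` and `naive_region_means`, the toy signal, and the threshold value are all assumptions made for illustration. The naive per-region mean it computes is the quantity that is biased upward by selection, which is the bias the article's conditional estimators are designed to correct.

```python
from typing import List, Tuple


def detect_regions(signal: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each maximal run of
    consecutive sites whose value is strictly above the threshold.

    Hypothetical helper for illustration only.
    """
    regions = []
    start = None  # index where the current run began, or None if not in a run
    for i, value in enumerate(signal):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            regions.append((start, i))  # the run ended at the previous site
            start = None
    if start is not None:
        regions.append((start, len(signal)))  # run extends to the last site
    return regions


def naive_region_means(signal: List[float], regions: List[Tuple[int, int]]) -> List[float]:
    """Unadjusted mean of the observed signal within each detected region.

    Because every site in a region was selected for exceeding the threshold,
    this mean tends to overstate the population effect.
    """
    return [sum(signal[a:b]) / (b - a) for a, b in regions]


# Toy data (made up): one observed effect estimate per site along the domain.
signal = [0.1, 1.2, 1.5, 0.3, -0.2, 2.0, 1.1, 1.4, 0.0]
regions = detect_regions(signal, threshold=1.0)
means = naive_region_means(signal, regions)
print(regions)
print(means)
```

In this toy example, two regions are detected: sites 1 to 2 and sites 5 to 7. Every site in them exceeds the threshold by construction, so the naive means cannot fall below it, regardless of the true effect.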