Robustness and accuracy of methods for high dimensional data analysis based on Student's t-statistic
Publication type:
Article
Authors:
Delaigle, Aurore; Hall, Peter; Jin, Jiashun
Affiliations:
University of Melbourne; University of California System; University of California Davis; Carnegie Mellon University
Journal:
Journal of the Royal Statistical Society Series B: Statistical Methodology
ISSN/ISBN:
1369-7412
DOI:
10.1111/j.1467-9868.2010.00761.x
Publication date:
2011
Pages:
283-301
Keywords:
false discovery rate
multiple test procedures
bootstrap
approximations
convergence
proportion
errors
tests
rates
Abstract:
Student's t-statistic is finding applications today that were never envisaged when it was introduced more than a century ago. Many of these applications rely on properties, e.g. robustness against heavy-tailed sampling distributions, that were not explicitly considered until relatively recently. We explore these features of the t-statistic in the context of its application to very high dimensional problems, including feature selection and ranking, the simultaneous testing of many different hypotheses and sparse, high dimensional signal detection. Robustness properties of the t-ratio are highlighted, and it is established that those properties are preserved under applications of the bootstrap. In particular, bootstrap methods correct for skewness and therefore lead to second-order accuracy, even in the extreme tails. Indeed, it is shown that the bootstrap and also the more popular but less accurate t-distribution and normal approximations are more effective in the tails than towards the middle of the distribution. These properties motivate new methods, e.g. bootstrap-based techniques for signal detection, that confine attention to the significant tail of a statistic.
Source URL: