Miscellanea Bagging cross-validated bandwidths with application to big data

Publication type:
Article
Authors:
Barreiro-Ures, D.; Cao, R.; Francisco-Fernandez, M.; Hart, J. D.
Affiliations:
Universidade da Coruna; Texas A&M University System; Texas A&M University College Station
Journal:
BIOMETRIKA
ISSN/ISBN:
0006-3444
DOI:
10.1093/biomet/asaa092
Publication date:
2021
Pages:
981-988
Keywords:
DENSITY-ESTIMATION selection
Abstract:
Hall & Robinson (2009) proposed and analysed the use of bagged cross-validation to choose the bandwidth of a kernel density estimator. They established that bagging greatly reduces the noise inherent in ordinary cross-validation, and hence leads to a more efficient bandwidth selector. The asymptotic theory of Hall & Robinson (2009) assumes that N, the number of bagged subsamples, is ∞. We expand upon their theoretical results by allowing N to be finite, as it is in practice. Our results indicate an important difference in the rate of convergence of the bagged cross-validation bandwidth for the cases N = ∞ and N < ∞. Simulations quantify the improvement in statistical efficiency and computational speed that can result from using bagged cross-validation as opposed to a binned implementation of ordinary cross-validation. The performance of the bagged bandwidth is also illustrated on a real, very large dataset. Finally, a byproduct of our study is the correction of errors appearing in the Hall & Robinson (2009) expression for the asymptotic mean squared error of the bagging selector.
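For readers unfamiliar with the technique, the following is a minimal sketch of a bagged cross-validation bandwidth, not the authors' implementation. It assumes a Gaussian kernel, least-squares cross-validation minimised over a grid, subsamples of size r drawn without replacement, and the usual n^(-1/5) rescaling from subsample size r to full sample size n. The function names, the search grid and the normal-reference scaling of that grid are hypothetical choices made for illustration.

```python
import numpy as np


def lscv_score(x, h):
    """Least-squares cross-validation criterion for a Gaussian kernel."""
    n = x.size
    d = (x[:, None] - x[None, :]) / h
    # Integral of the squared estimate: a Gaussian convolved with itself is N(0, 2).
    int_f2 = np.exp(-0.25 * d**2).sum() / (n**2 * h * np.sqrt(4.0 * np.pi))
    # Leave-one-out term: remove the diagonal (i == j) contributions.
    k = np.exp(-0.5 * d**2) / np.sqrt(2.0 * np.pi)
    loo = (k.sum() - np.trace(k)) / (n * (n - 1) * h)
    return int_f2 - 2.0 * loo


def cv_bandwidth(x, grid):
    """Ordinary cross-validation bandwidth: grid point minimising the criterion."""
    scores = [lscv_score(x, h) for h in grid]
    return float(grid[int(np.argmin(scores))])


def bagged_cv_bandwidth(x, r, n_subsamples, rng, grid_size=60):
    """Average the CV bandwidths of N subsamples of size r, then rescale to size n."""
    n = x.size
    # Assumption: search grid centred on a normal-reference bandwidth for size r.
    h_ref = 1.06 * x.std(ddof=1) * r ** (-0.2)
    grid = np.geomspace(0.1 * h_ref, 3.0 * h_ref, grid_size)
    hs = [
        cv_bandwidth(rng.choice(x, size=r, replace=False), grid)
        for _ in range(n_subsamples)
    ]
    # A second-order kernel has optimal bandwidth of order n^(-1/5),
    # so a bandwidth tuned for size r is multiplied by (r / n)^(1/5).
    return (r / n) ** 0.2 * float(np.mean(hs))


rng = np.random.default_rng(0)
sample = rng.normal(size=2000)
h_bag = bagged_cv_bandwidth(sample, r=200, n_subsamples=10, rng=rng)
```

Each subsample costs O(r^2) per grid point rather than O(n^2), which is where the computational saving on large datasets comes from; here N corresponds to `n_subsamples` and is finite, as in the setting the abstract studies.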