A reality check for data snooping
Publication type:
Article
Author(s):
White, H
Affiliation(s):
University of California System; University of California San Diego
Journal:
ECONOMETRICA
ISSN/ISBN:
0012-9682
DOI:
10.1111/1468-0262.00152
Publication date:
2000
Pages:
1097-1126
Keywords:
STATIONARY BOOTSTRAP
confidence-regions
structural-change
random-variables
model selection
tests
inference
econometrics
rules
Abstract:
Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. When such data reuse occurs, there is always the possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. This problem is practically unavoidable in the analysis of time-series data, as typically only a single history measuring a given phenomenon of interest is available for analysis. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided, but in fact it is endemic. The main problem has been a lack of sufficiently simple practical methods capable of assessing the potential dangers of data snooping in a given situation. Our purpose here is to provide such methods by specifying a straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This permits data snooping to be undertaken with some degree of confidence that one will not mistake results that could have been generated by chance for genuinely good results.
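
Illustrative sketch (not part of the original record): the abstract does not spell out the procedure, so the following is only a minimal Python sketch of how a test of this kind is commonly implemented, combining a max-statistic over candidate models with the stationary bootstrap named in the keywords. The function names, the restart probability q, the number of bootstrap replications, and the input layout (f[t][k] is the performance of model k relative to the benchmark at time t, with positive values favoring the model) are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def stationary_bootstrap_indices(n, q, rng):
    """Resample time indices in blocks of geometric length (mean 1/q).

    Hypothetical helper: with probability q a new block starts at a random
    position; otherwise the current block continues, wrapping around at n.
    """
    idx = [rng.randrange(n)]
    for _ in range(1, n):
        if rng.random() < q:
            idx.append(rng.randrange(n))      # start a new block
        else:
            idx.append((idx[-1] + 1) % n)     # continue the current block
    return idx


def column_means(rows, idx):
    """Mean of each model's relative performance over the given time indices."""
    sums = [0.0] * len(rows[0])
    for t in idx:
        for k, value in enumerate(rows[t]):
            sums[k] += value
    return [s / len(idx) for s in sums]


def reality_check_pvalue(f, q=0.1, n_boot=500, seed=0):
    """Bootstrap p-value for H0: no model beats the benchmark on average.

    f is a list of n rows, one per time period, each holding the relative
    performance of every candidate model (assumed layout, see lead-in).
    """
    n = len(f)
    rng = random.Random(seed)
    f_bar = column_means(f, range(n))
    # Observed statistic: scaled mean performance of the best model.
    v_obs = math.sqrt(n) * max(f_bar)
    exceed = 0
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n, q, rng)
        star = column_means(f, idx)
        # Bootstrap statistic, recentered at the sample means.
        v_star = math.sqrt(n) * max(s - m for s, m in zip(star, f_bar))
        if v_star > v_obs:
            exceed += 1
    return exceed / n_boot
```

A small p-value suggests that the best model's apparent advantage is unlikely to be an artifact of searching over many models; a large one suggests it could plausibly be due to chance. Taking the maximum over all models inside each bootstrap replication is what accounts for the specification search, and the choice of q (here 0.1, an arbitrary default) should reflect the dependence in the data.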