Missing values handling for machine learning portfolios
Publication type:
Article
Authors:
Chen, Andrew Y.; McCoy, Jack
Affiliations:
Federal Reserve System - USA; Federal Reserve System Board of Governors; Columbia University
Journal:
JOURNAL OF FINANCIAL ECONOMICS
ISSN/ISBN:
0304-405X
DOI:
10.1016/j.jfineco.2024.103815
Publication date:
2024
Keywords:
Stock market predictability
Stock market anomalies
Missing values
Machine Learning
Abstract:
We characterize the structure and origins of missingness for 159 cross-sectional return predictors and study missing value handling for portfolios constructed using machine learning. Simply imputing with cross-sectional means performs well compared to rigorous expectation-maximization methods. This stems from three facts about predictor data: (1) missingness occurs in large blocks organized by time, (2) cross-sectional correlations are small, and (3) missingness tends to occur in blocks organized by the underlying data source. As a result, observed data provide little information about missing data. Sophisticated imputations introduce estimation noise that can lead to underperformance if machine learning is not carefully applied.
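The cross-sectional mean imputation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `impute_cross_sectional_mean`, the (months, stocks, predictors) array layout, the toy numbers, and the fallback value of 0.0 for a predictor that is entirely missing in a month are all assumptions made for illustration.

```python
import numpy as np


def impute_cross_sectional_mean(panel):
    """Fill NaNs with the same-month mean across stocks, per predictor.

    panel: array of shape (months, stocks, predictors), NaN = missing.
    Hypothetical helper for illustration only.
    """
    panel = np.asarray(panel, dtype=float)
    filled = panel.copy()
    for t in range(panel.shape[0]):
        month = panel[t]                       # shape (stocks, predictors)
        observed = ~np.isnan(month)
        counts = observed.sum(axis=0)          # observed stocks per predictor
        sums = np.where(observed, month, 0.0).sum(axis=0)
        # Mean over observed stocks. Assumption: if a predictor has no
        # observations at all in this month, fall back to 0.0.
        means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
        filled[t] = np.where(observed, month, means)
    return filled


# Toy panel: 2 months, 3 stocks, 2 predictors (made-up values).
panel = np.array([
    [[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]],
    [[np.nan, 2.0], [np.nan, np.nan], [np.nan, 6.0]],
])
filled = impute_cross_sectional_mean(panel)
print(filled)
```

In the second month of the toy panel the first predictor is missing for every stock, which mimics the block structure of missingness the abstract describes. In that case no cross-sectional mean exists, so the sketch uses the assumed constant fallback.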