PREDICTING CENSUS SURVEY RESPONSE RATES WITH PARSIMONIOUS ADDITIVE MODELS AND STRUCTURED INTERACTIONS

Document type:

Article

Authors:

Ibrahim, Shibal; Radchenko, Peter; Ben-David, Emanuel; Mazumder, Rahul

Affiliations:

Massachusetts Institute of Technology (MIT); University of Sydney; Massachusetts Institute of Technology (MIT)

Journal:

ANNALS OF APPLIED STATISTICS

ISSN/ISBN:

1932-6157

DOI:

10.1214/24-AOAS1929

Publication date:

2025

Pages:

94-120

Keywords:

variable selection; linear models; regression; Lasso

Abstract:

In this paper we consider the problem of predicting survey response rates using a family of flexible and interpretable nonparametric models. The study is motivated by the U.S. Census Bureau's well-known ROAM application, which uses a linear regression model trained on the U.S. Census Planning Database data to identify hard-to-survey areas. A crowdsourcing competition (Public Opin. Q. 81 (2016) 144-156) organized more than 10 years ago revealed that machine learning methods, based on ensembles of regression trees, led to the best performance in predicting survey response rates; however, the corresponding models could not be adopted for the intended application due to their black-box nature. We consider nonparametric additive models with a small number of main and pairwise interaction effects using ℓ0-based penalization. From a methodological viewpoint, we study our estimator's computational and statistical aspects and discuss variants incorporating strong hierarchical interactions. Our algorithms (open-sourced on GitHub) extend the computational frontiers of existing algorithms for sparse additive models to be able to handle datasets relevant to the application we consider. We discuss and interpret findings from our model on the U.S. Census Planning Database. In addition to being useful from an interpretability standpoint, our models lead to predictions comparable to popular black-box machine learning methods based on gradient boosting and feedforward neural networks, suggesting that it is possible to have models that have the best of both worlds: good model accuracy and interpretability.
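
For orientation, an ℓ0-penalized additive model with main and pairwise interaction effects can be written generically as below. This is only an illustrative sketch, not necessarily the exact formulation used in the paper: the squared-error loss, the symbols $f_j$, $f_{jk}$, $\lambda$, $\alpha$, and the omission of any smoothness penalty are all assumptions made here for readability.

```latex
% Generic sketch (assumed notation): y_i is the response rate for area i,
% x_{ij} is covariate j for area i, f_j are main effects, f_{jk} are pairwise interactions.
\min_{\{f_j\},\,\{f_{jk}\}} \;
  \frac{1}{n} \sum_{i=1}^{n}
  \Bigl( y_i - \sum_{j=1}^{p} f_j(x_{ij})
             - \sum_{j<k} f_{jk}(x_{ij}, x_{ik}) \Bigr)^{2}
  + \lambda \Bigl( \sum_{j=1}^{p} \mathbf{1}[f_j \neq 0]
  + \alpha \sum_{j<k} \mathbf{1}[f_{jk} \neq 0] \Bigr)

% Strong hierarchy (standard definition): an interaction may enter
% only if both of its main effects are present.
f_{jk} \neq 0 \;\Longrightarrow\; f_j \neq 0 \ \text{and} \ f_k \neq 0
```

Under this reading, the penalty counts the number of nonzero component functions, which is what keeps the fitted model to a small set of main and interaction effects.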
