ROBUST BAYESIAN INFERENCE FOR BIG DATA: COMBINING SENSOR-BASED RECORDS WITH TRADITIONAL SURVEY DATA

Document type:
Article
Authors:
Rafei, Ali; Flannagan, Carol A. C.; West, Brady T.; Elliott, Michael R.
Affiliations:
University of Michigan System; University of Michigan
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/21-AOAS1531
Publication date:
2022
Pages:
1038-1070
Keywords:
propensity score; missing data; regression coefficients; model; imputation; nonresponse; populations; estimators; behavior; design
Abstract:
Big Data often presents as massive nonprobability samples. Not only is the selection mechanism often unknown, but larger data volume amplifies the relative contribution of selection bias to total error. Existing bias adjustment approaches assume that the conditional mean structures have been correctly specified for the selection indicator or key substantive measures. In the presence of a reference probability sample, these methods rely on a pseudolikelihood method to account for the sampling weights of the reference sample, which is parametric in nature. Under a Bayesian framework, handling the sampling weights is an even bigger hurdle. To further protect against model misspecification, we expand the idea of double robustness such that more flexible nonparametric methods as well as Bayesian models can be used for prediction. In particular, we employ Bayesian additive regression trees, which not only capture nonlinear associations automatically but also permit direct quantification of the uncertainty of point estimates through their posterior predictive draws. We apply our method to sensor-based naturalistic driving data from the second Strategic Highway Research Program using the 2017 National Household Travel Survey as a benchmark.
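The double robustness idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it is a generic doubly robust (augmented inverse-propensity-weighted) mean for a nonprobability sample paired with a weighted reference sample, and it assumes the selection propensities and outcome predictions have already been produced by some model (the abstract uses Bayesian additive regression trees for this; here they are simply passed in as numbers). The function name and all argument names are hypothetical.

```python
def doubly_robust_mean(y_np, pi_np, m_np, m_ref, w_ref):
    """Generic doubly robust estimate of a population mean (illustrative only).

    y_np  : observed outcomes in the nonprobability (big data) sample
    pi_np : estimated selection propensities for those units, each in (0, 1]
    m_np  : model-predicted outcomes for those same units
    m_ref : model-predicted outcomes for the reference probability sample
    w_ref : sampling weights of the reference probability sample
    """
    # Estimated population sizes implied by each sample's weights.
    n_hat_np = sum(1.0 / p for p in pi_np)
    n_hat_ref = sum(w_ref)

    # Inverse-propensity-weighted mean of the prediction residuals.
    residual_term = sum(
        (y - m) / p for y, m, p in zip(y_np, m_np, pi_np)
    ) / n_hat_np

    # Design-weighted mean of the predictions in the reference sample.
    prediction_term = sum(w * m for w, m in zip(w_ref, m_ref)) / n_hat_ref

    return residual_term + prediction_term


# Hypothetical toy inputs, not data from the study.
estimate = doubly_robust_mean(
    y_np=[1.0, 2.0, 3.0],
    pi_np=[0.5, 0.2, 0.1],
    m_np=[1.0, 2.0, 3.0],   # predictions that happen to match the outcomes
    m_ref=[2.0, 4.0],
    w_ref=[1.0, 3.0],
)
print(estimate)
```

When the outcome predictions are exact, the residual term vanishes and the estimate reduces to the weighted mean of the predictions in the reference sample, whatever the propensities are. When the predictions are all zero, it reduces to an inverse-propensity-weighted mean of the observed outcomes. This is the sense in which only one of the two models needs to be correctly specified.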
Source URL: