PHENOMENOLOGICAL FORECASTING OF DISEASE INCIDENCE USING HETEROSKEDASTIC GAUSSIAN PROCESSES: A DENGUE CASE STUDY
Publication type:
Article
Authors:
Johnson, Leah R.; Gramacy, Robert B.; Cohen, Jeremy; Mordecai, Erin; Murdock, Courtney; Rohr, Jason; Ryan, Sadie J.; Stewart-Ibarra, Anna M.; Weikel, Daniel
Affiliations:
Virginia Polytechnic Institute & State University; State University System of Florida; University of South Florida; Stanford University; University System of Georgia; University of Georgia; University System of Georgia; University of Georgia; State University System of Florida; University of Florida; State University System of Florida; University of Florida; State University of New York (SUNY) System; SUNY Upstate Medical University; State University of New York (SUNY) System; SUNY Upstate Medical University; University of Michigan System; University of Michigan
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/17-AOAS1090
Publication date:
2018
Pages:
27-66
Keywords:
temperature
transmission
uncertainty
epidemics
Abstract:
In 2015 the US federal government sponsored a dengue forecasting competition using historical case data from Iquitos, Peru, and San Juan, Puerto Rico. Competitors were evaluated on several aspects of out-of-sample forecasts, including the targets of peak week, peak incidence during that week, and total season incidence, across each of several seasons. Our team was one of the winners of that competition, outperforming other teams in multiple targets/locales. In this paper we report on our methodology, a large component of which, surprisingly, ignores the known biology of epidemics at large (for example, relationships between dengue transmission and environmental factors) and instead relies on flexible nonparametric nonlinear Gaussian process (GP) regression fits that memorize the trajectories of past seasons and then match the dynamics of the unfolding season to past ones in real time. Our phenomenological approach has advantages in situations where disease dynamics are less well understood, where measurements and forecasts of ancillary covariates like precipitation are unavailable, and/or where the strength of association with cases is as yet unknown. In particular, we show that the GP approach generally outperforms a more classical generalized linear (autoregressive) model (GLM) that we developed to utilize abundant covariate information. We illustrate variations of our method(s) on the two benchmark locales alongside a full summary of results submitted by other contest competitors.
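As a rough illustration of the idea in the abstract, and not the authors' implementation, the following sketch fits a zero-mean GP with input-dependent (heteroskedastic) noise to pooled past seasons and reads a peak week off the posterior mean. Everything here is an assumption made for illustration: the synthetic incidence curve, the squared-exponential kernel, the fixed hyperparameters, the helper names `sq_exp_kernel` and `gp_predict`, and the fact that the per-week noise variances are treated as known rather than estimated.

```python
import numpy as np

def sq_exp_kernel(a, b, lengthscale, variance):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_predict(x_train, y_train, noise_var, x_test, lengthscale=4.0, variance=1.0):
    """Posterior mean and latent variance of a zero-mean GP.

    noise_var holds one noise variance per training point, which is what
    makes the model heteroskedastic. Here it is assumed known.
    """
    K = sq_exp_kernel(x_train, x_train, lengthscale, variance) + np.diag(noise_var)
    K_s = sq_exp_kernel(x_test, x_train, lengthscale, variance)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = variance - np.sum(v ** 2, axis=0)
    return mean, var

# Synthetic "past seasons" (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n_seasons = 5
weeks = np.arange(1.0, 53.0)                          # week of season, 1..52
true_curve = np.exp(-0.5 * ((weeks - 30.0) / 6.0) ** 2)
noise_var = 0.005 + 0.02 * true_curve                 # assumed: noisier near the peak
y = true_curve + rng.normal(0.0, np.sqrt(noise_var), size=(n_seasons, weeks.size))

# Pool all seasons by week of season and fit one GP to the replicates.
x_train = np.tile(weeks, n_seasons)
y_train = y.ravel()
noise_train = np.tile(noise_var, n_seasons)

mean, var = gp_predict(x_train, y_train, noise_train, weeks)
peak_week = weeks[np.argmax(mean)]
print("estimated peak week:", peak_week)
```

Pooling seasons on a shared week-of-season axis is the simplest way to let past trajectories inform the fit; matching an unfolding season to individual past seasons, as the abstract describes, would need additional structure that this sketch omits.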
Source URL: