A Unified Framework for Residual Diagnostics in Generalized Linear Models and Beyond

Document type:
Article; Early Access
Authors:
Liu, Dungang; Lin, Zewei; Zhang, Heping
Affiliations:
University System of Ohio; University of Cincinnati; Texas State University System; Texas State University San Marcos; Yale University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2025.2504037
Publication date:
2025
Keywords:
GOODNESS-OF-FIT deviance residuals regression tests association discrete
Abstract:
Model diagnostics is an indispensable component in regression analysis, yet it has not been well addressed in generalized linear models (GLMs). When outcome data are discrete, classical Pearson and deviance residuals have limited utility in generating diagnostic insights. This article establishes a novel diagnostic framework for GLMs and their extensions. Unlike the convention of using a point statistic as a residual, we propose to use a function as a vehicle to retain residual information. In the presence of data discreteness, we show that such a functional residual is appropriate for summarizing the residual randomness that cannot be captured by the structural part of the model. We establish its theoretical properties, which lead to the innovation of new diagnostic tools including the functional-residual-vs-covariate plot and Function-to-Function plot (similar to a Quantile-Quantile plot). Our numerical studies demonstrate that the use of these tools can reveal a variety of model misspecifications, such as not properly including a higher-order term, an explanatory variable, an interaction effect, a dispersion parameter, or a zero-inflation component. As a general notion, the functional residual considerably broadens the diagnostic scope as it applies to GLMs for binary, ordinal and count data as well as semiparametric models (e.g., generalized additive models), all in a unified framework. Its functional form provides a way to unify point residuals such as Liu-Zhang's surrogate residual and Li-Shepherd's probability-scale residual. As its graphical outputs can be interpreted in a similar way to those for linear models, our framework also unifies diagnostic interpretation for discrete data and continuous data. Supplementary materials for this article are available online.
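
To make the contrast between a point residual and a range of residual information concrete, here is a minimal sketch for count data. It is an illustration only and not the authors' method: the abstract does not define the functional residual, so the code makes no attempt to reproduce it. It assumes a Poisson model with a known fitted mean `mu` (a hypothetical value), computes the Li-Shepherd probability-scale residual P(Y* < y) - P(Y* > y), and shows the interval (F(y-1), F(y)) that a discrete observation occupies on the fitted CDF scale. Drawing uniformly within that interval is the standard randomized probability integral transform, which is Uniform(0, 1) when the model is correct. All function names are hypothetical.

```python
import math
import random


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0.0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for j in range(1, k + 1):
        term *= mu / j    # P(Y = j) from P(Y = j - 1)
        total += term
    return min(total, 1.0)


def cdf_interval(y, mu):
    """Interval (F(y-1), F(y)) occupied by observation y under the fitted model."""
    return poisson_cdf(y - 1, mu), poisson_cdf(y, mu)


def probability_scale_residual(y, mu):
    """Li-Shepherd point residual: P(Y* < y) - P(Y* > y) under the fitted model."""
    lower, upper = cdf_interval(y, mu)
    return lower - (1.0 - upper)


def randomized_pit(y, mu, rng):
    """One uniform draw from (F(y-1), F(y)): a randomized point summary."""
    lower, upper = cdf_interval(y, mu)
    return rng.uniform(lower, upper)


def draw_poisson(mu, rng, max_k=200):
    """Simulate Poisson(mu) by CDF inversion (capped at max_k for safety)."""
    u = rng.random()
    k = 0
    while poisson_cdf(k, mu) < u and k < max_k:
        k += 1
    return k


rng = random.Random(0)
mu = 2.0  # assumed fitted mean, for illustration only

# Under a correctly specified model, the randomized draws should look uniform
# and the probability-scale residuals should average close to zero.
ys = [draw_poisson(mu, rng) for _ in range(5000)]
pits = [randomized_pit(y, mu, rng) for y in ys]
psrs = [probability_scale_residual(y, mu) for y in ys]
mean_pit = sum(pits) / len(pits)
mean_psr = sum(psrs) / len(psrs)
```

Note that the probability-scale residual equals F(y-1) + F(y) - 1, that is, twice the midpoint of the interval minus one, so it is a deterministic point summary of the same interval that the randomized draw samples from.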