ASSESSING INFLUENCE IN VARIABLE SELECTION PROBLEMS

Document type:
Article
Authors:
LEGER, C; ALTMAN, N
Affiliation:
Cornell University
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.2307/2290335
Publication date:
1993
Pages:
547-556
Keywords:
multiple regression; linear regression; Cp
Abstract:
Variable selection techniques are often used in combination with multiple linear regression to produce a parsimonious model that fits the data well. It is clearly undesirable for the final model to depend strongly on the inclusion of a few influential cases in the data set. This article discusses a measure of the influence of single cases on the final model, based on a similar measure used in ordinary multiple regression. When variables are selected objectively, deletion of individual cases can strongly affect the choice of model. The influence of individual cases on the parameters of the selected model is often assessed as part of the model-building process. However, such conditional measures fail to evaluate the influence of the cases on the variable selection process itself. Modern computing environments make it feasible to use an unconditional criterion to determine the influence of each case on the selection procedure. A number of examples are discussed to illustrate the differences between these approaches. Heuristics are developed to explain the examples. We conclude that, although the conditional approach gives valuable information about the selected model, the use of the unconditional approach can lead to greater insight into the influence of individual observations on the process of model selection.
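The unconditional idea described in the abstract can be illustrated with a minimal sketch: delete each case in turn, rerun the whole selection procedure, and record the cases whose deletion changes the selected variables. This is not the authors' influence measure. It assumes all-subsets selection by Mallows' Cp (suggested only by the "cp" keyword), and the function names `rss`, `select_by_cp`, and `unconditional_changes` are hypothetical.

```python
import itertools
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares for OLS on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def select_by_cp(X, y):
    """All-subsets selection minimizing Mallows' Cp (assumed selection rule)."""
    n, k = X.shape
    # Error variance estimated from the full model with all k predictors.
    sigma2 = rss(X, y, range(k)) / (n - k - 1)
    best, best_cp = None, np.inf
    for size in range(k + 1):
        for cols in itertools.combinations(range(k), size):
            p = size + 1  # number of coefficients, including the intercept
            cp = rss(X, y, cols) / sigma2 - n + 2 * p
            if cp < best_cp:
                best, best_cp = cols, cp
    return best


def unconditional_changes(X, y):
    """Rerun the selection with each case deleted.

    Returns the model selected on the full data and a dict mapping each
    case index whose deletion changes the selection to the new selection.
    """
    full = select_by_cp(X, y)
    changed = {}
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        selected = select_by_cp(X[keep], y[keep])
        if selected != full:
            changed[i] = selected
    return full, changed
```

Calling `unconditional_changes(X, y)` on an `n × k` predictor array and a response vector returns the full-data selection plus the cases that alter it. A conditional analysis would instead fix the columns in `full` and compute ordinary case-deletion diagnostics for that one model, which is why it cannot reveal a case that changes which variables are chosen.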