Univariate nonparametric regression in the presence of auxiliary covariates
Publication type:
Article
Author:
Efromovich, S
Affiliation:
University of New Mexico
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/016214505000000376
Publication date:
2005
Pages:
1185-1201
Keywords:
additive regression
Abstract:
This article addresses the problem of finding a relationship between the univariate predictor and the response when regression errors, created in part by known auxiliary covariates, are too large for a reliable regression estimation. A typical example is a controlled random design experiment with a large number of covariates, where the statistician is interested in the effect of a particular covariate and this effect is blurred by a large regression noise created by other covariates. This article develops a theory of asymptotically optimal nonparametric univariate regression estimation in the presence of auxiliary covariates. Here optimality means mimicking the performance of an oracle that knows the effects of auxiliary covariates on the response. The asymptotic theory shows that such an optimal estimation is possible, and also explains how to evaluate the noise created by auxiliary covariates and how to develop an estimator for the interesting case of small sample sizes. The concept of modeling regression noise is well known in analysis of covariance (ANCOVA), and here it is applied in the optimal way to a nonparametric regression setting. A procedure for small sample sizes, denoised scattergram, is tested on simulated examples and a real dataset with 84 observations and 9 auxiliary covariates; the results justify the practical feasibility of the developed method. The method also allows a practitioner to visualize how a dataset would appear if the effects of auxiliary covariates were eliminated and to determine why an exhibited regression function has any given particular shape. Many practical recommendations (in particular, how to use known shape restrictions) are presented and discussed. 
The asymptotic theory, a numerical study, and analysis of a real dataset indicate that the proposed method of reducing the variance of regression errors created by auxiliary covariates is feasible, is easy to implement, and improves the likelihood of a meaningful regression analysis.
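
The idea of a denoised scattergram can be illustrated with a minimal sketch. It is not the article's estimator: it assumes an additive model Y = m(X) + g_1(Z_1) + ... + g_K(Z_K) + error, fits all components jointly by ordinary least squares on a small cosine basis, and subtracts the fitted auxiliary effects from the responses. The function names, the basis, the number of basis terms, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def cosine_basis(u, n_terms):
    """Columns sqrt(2) * cos(pi * j * u), j = 1..n_terms, for u in [0, 1]."""
    j = np.arange(1, n_terms + 1)
    return np.sqrt(2.0) * np.cos(np.pi * np.outer(u, j))


def denoise_responses(x, z, y, n_terms=3):
    """Subtract least-squares estimates of additive auxiliary effects from y.

    x: (n,) predictor of interest, scaled to [0, 1]
    z: (n, K) auxiliary covariates, each scaled to [0, 1]
    y: (n,) responses
    """
    n, k = z.shape
    # Design matrix: intercept, basis in x, then one basis block per auxiliary covariate.
    blocks = [np.ones((n, 1)), cosine_basis(x, n_terms)]
    blocks += [cosine_basis(z[:, i], n_terms) for i in range(k)]
    design = np.hstack(blocks)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Only the auxiliary part of the fit is removed; the x-part stays in the responses.
    n_main = 1 + n_terms
    aux_fit = design[:, n_main:] @ coef[n_main:]
    return y - aux_fit


# Simulated example (assumed data, not the article's): the auxiliary effects
# dominate the noise around the regression function m(x).
rng = np.random.default_rng(0)
n, k = 200, 3
x = rng.uniform(size=n)
z = rng.uniform(size=(n, k))
m = np.sin(2 * np.pi * x)
aux = (2.0 * np.cos(np.pi * z[:, 0])
       + 1.5 * np.cos(2 * np.pi * z[:, 1])
       - 1.0 * np.cos(np.pi * z[:, 2]))
y = m + aux + 0.2 * rng.standard_normal(n)

y_denoised = denoise_responses(x, z, y)
raw_mse = np.mean((y - m) ** 2)
denoised_mse = np.mean((y_denoised - m) ** 2)
print("scatter around m(x), raw:", raw_mse)
print("scatter around m(x), denoised:", denoised_mse)
```

Plotting `y_denoised` against `x` gives a rough analogue of the denoised scattergram described in the abstract: the points show how the data might look if the auxiliary effects were removed, and any univariate smoother can then be applied to them.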