More efficient approximation of smoothing splines via space-filling basis selection
Document type:
Article
Authors:
Meng, Cheng; Zhang, Xinlian; Zhang, Jingyi; Zhong, Wenxuan; Ma, Ping
Affiliations:
University System of Georgia; University of Georgia
Journal:
BIOMETRIKA
ISSN/ISBN:
0006-3444
DOI:
10.1093/biomet/asaa019
Publication date:
2020
Pages:
723-735
Keywords:
regression
computation
Abstract:
We consider the problem of approximating smoothing spline estimators in a nonparametric regression model. When applied to a sample of size n, the smoothing spline estimator can be expressed as a linear combination of n basis functions, requiring O(n^3) computational time when the number d of predictors is two or more. Such a sizeable computational cost hinders the broad applicability of smoothing splines. In practice, the full-sample smoothing spline estimator can be approximated by an estimator based on q randomly selected basis functions, resulting in a computational cost of O(nq^2). It is known that these two estimators converge at the same rate when q is of order O{n^(2/(pr+1))}, where p ∈ [1, 2] depends on the true function and r > 1 depends on the type of spline. Such a q is called the essential number of basis functions. In this article, we develop a more efficient basis selection method. By selecting basis functions corresponding to approximately equally spaced observations, the proposed method chooses a set of basis functions with great diversity. The asymptotic analysis shows that the proposed smoothing spline estimator can decrease q to around O{n^(1/(pr+1))} when d ≤ pr + 1. Applications to synthetic and real-world datasets show that the proposed method leads to a smaller prediction error than other basis selection methods.
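The contrast between random selection and selection at approximately equally spaced observations can be illustrated with a minimal one-dimensional sketch. This is not the authors' algorithm: the function names are hypothetical, the nearest-to-grid rule is an assumed stand-in for the selection step, and the d-dimensional case treated in the article is not attempted here.

```python
import numpy as np


def select_space_filling_indices(x, q):
    """Hypothetical helper: indices of the observations nearest to q equally spaced targets."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    # Targets are the midpoints of q equal-width cells covering the observed range.
    targets = lo + (np.arange(q) + 0.5) * (hi - lo) / q
    # For each target, find the closest observation (a q-by-n distance table).
    nearest = np.abs(x[None, :] - targets[:, None]).argmin(axis=1)
    # Two targets may share a nearest observation, so fewer than q indices can be returned.
    return np.unique(nearest)


def select_random_indices(x, q, seed=0):
    """Hypothetical helper: q indices drawn uniformly at random without replacement."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(len(x), size=q, replace=False))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.beta(2.0, 5.0, size=1000)  # deliberately uneven sample of predictor values
    print(np.sort(x[select_space_filling_indices(x, 8)]))
    print(np.sort(x[select_random_indices(x, 8)]))
```

Each selected index would correspond to one retained basis function. To give a sense of scale for the rates in the abstract, ignoring constants: with n = 10^6, p = 2 and r = 2, n^(2/(pr+1)) = n^0.4 is about 251, while n^(1/(pr+1)) = n^0.2 is about 16.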