
On bandwidth choice for spatial data density estimation

Type:

Article

Authors:

Jiang, Zhenyu; Ling, Nengxiang; Lu, Zudi; Tjostheim, Dag; Zhang, Qiang

Affiliations:

University of Southampton; Hefei University of Technology; University of Bergen; Beijing University of Chemical Technology

Journal:

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY

ISSN/ISBN:

1369-7412

DOI:

10.1111/rssb.12367

Publication date:

2020

Pages:

817-840

Keywords:

cross-validation; nonparametric regression; asymptotic behavior; selection

Abstract:

Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively conforms with local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the cross-validation idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, actually, also extending time series and independent data optimality results. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically as if the true pilot were known. Numerical simulations show that finite sample performance of the SCV adaptive bandwidth choice works quite well. It outperforms the existing R routines such as the 'rule of thumb' and the so-called 'second-generation' Sheather-Jones bandwidths for moderate and big data sets. An empirical application to a set of spatial soil data is further implemented with non-Gaussian features significantly identified.
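
To make the cross-validation idea in the abstract concrete, the sketch below implements the classical least-squares cross-validation criterion for a single global (non-adaptive) bandwidth with a Gaussian kernel on one-dimensional independent data. It is not the paper's SCV procedure: the spatial-lattice structure, the adaptive bandwidth, and the pilot density are all omitted. The function names, the simulated sample, and the candidate grid are assumptions made purely for illustration.

```python
import math
import random


def normal_pdf(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return math.exp(-0.5 * (u / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def lscv_score(x, h):
    """Least-squares cross-validation criterion for bandwidth h.

    CV(h) = integral of fhat^2  -  (2/n) * sum_i fhat_{-i}(x_i),
    where fhat is the Gaussian-kernel density estimate and fhat_{-i}
    is the same estimate computed without observation i.
    """
    n = len(x)
    int_f2 = 0.0   # accumulates the double sum for the integral of fhat^2
    loo_sum = 0.0  # accumulates the leave-one-out kernel sums
    for i in range(n):
        for j in range(n):
            d = x[i] - x[j]
            # Two Gaussian kernels of scale h convolve to a Gaussian
            # of scale sqrt(2) * h, giving a closed form for the integral.
            int_f2 += normal_pdf(d, math.sqrt(2.0) * h)
            if i != j:
                loo_sum += normal_pdf(d, h)
    int_f2 /= n * n
    loo_mean = loo_sum / (n * (n - 1))
    return int_f2 - 2.0 * loo_mean


def select_bandwidth(x, grid):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(grid, key=lambda h: lscv_score(x, h))


# Illustrative data: 100 draws from a standard normal (an assumption,
# not data from the paper).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 31)]  # candidates 0.05, 0.10, ..., 1.50
h_cv = select_bandwidth(sample, grid)
print("selected bandwidth:", h_cv)
```

A grid search is used here only for transparency; a numerical optimizer could be substituted. Extending the sketch toward the setting of the abstract would require, at a minimum, replacing the fixed h with a location-dependent bandwidth built from a pilot density estimate and accounting for spatial dependence between lattice sites.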