TESTING FOR THE RANK OF A COVARIANCE OPERATOR

Document type:

Article

Authors:

Chakraborty, Anirvan; Panaretos, Victor M.

Affiliations:

Indian Institute of Science Education & Research (IISER) - Kolkata; Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne

Journal:

ANNALS OF STATISTICS

ISSN/ISBN:

0090-5364

DOI:

10.1214/22-AOS2238

Publication date:

2022

Pages:

3510-3537

Keywords:

PRINCIPAL COMPONENTS; finite dimensionality; number

Abstract:

How can we discern whether the covariance operator of a stochastic process is of reduced rank, and if so, what its precise rank is? And how can we do so at a given level of confidence? This question is central to a great deal of methods for functional data, which require low-dimensional representations whether by functional PCA or other methods. The difficulty is that the determination is to be made on the basis of i.i.d. replications of the process observed discretely and with measurement error contamination. This adds a ridge to the empirical covariance, obfuscating the underlying dimension. We build a matrix-completion inspired test statistic that circumvents this issue by measuring the best possible least square fit of the empirical covariance's off-diagonal elements, optimised over covariances of given finite rank. For a fixed grid of sufficiently large size, we determine the statistic's asymptotic null distribution as the number of replications grows. We then use it to construct a bootstrap implementation of a stepwise testing procedure controlling the familywise error rate corresponding to the collection of hypotheses formalising the question at hand. Under minimal regularity assumptions, we prove that the procedure is consistent and that its bootstrap implementation is valid. The procedure circumvents smoothing and associated smoothing parameters, is indifferent to measurement error heteroskedasticity, and does not assume a low-noise regime. An extensive simulation study reveals an excellent practical performance, stably across a wide range of settings, and the procedure is further illustrated by means of two data analyses.
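
The off-diagonal least squares fit described in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: the function name `offdiag_rank_fit`, the iteration scheme (repeatedly imputing the diagonal and truncating an eigendecomposition, which only guarantees a monotone decrease of the objective, not a global minimum), and the toy rank-2 covariance are all assumptions made for illustration. It also omits the null distribution and bootstrap calibration entirely. It only shows why ignoring the diagonal removes the effect of a heteroskedastic noise ridge.

```python
import numpy as np


def offdiag_rank_fit(C, r, n_iter=500, tol=1e-12):
    """Approximate the minimum, over PSD matrices L of rank <= r, of
    sum_{i != j} (C_ij - L_ij)^2. Hypothetical helper, local optimiser only."""
    C = np.asarray(C, dtype=float)
    K = C.shape[0]
    off = ~np.eye(K, dtype=bool)      # mask selecting off-diagonal entries
    M = C.copy()                      # working matrix; its diagonal gets imputed
    prev = np.inf
    for _ in range(n_iter):
        # Best PSD rank-r approximation of M: keep the r largest eigenpairs,
        # clipping negative eigenvalues at zero.
        w, V = np.linalg.eigh(M)      # eigenvalues in ascending order
        w_top = np.clip(w[-r:], 0.0, None)
        L = (V[:, -r:] * w_top) @ V[:, -r:].T
        stat = float(np.sum((C - L)[off] ** 2))
        if prev - stat <= tol * max(stat, 1.0):
            break                     # objective has stopped decreasing
        prev = stat
        # Replace the diagonal by the current fit; off-diagonals stay untouched.
        M[np.diag_indices(K)] = np.diag(L)
    return stat, L


# Toy population covariance (assumed for illustration): rank-2 signal on a
# grid of K points plus a heteroskedastic noise ridge on the diagonal.
K = 20
t = (np.arange(K) + 0.5) / K
phi1 = np.sqrt(2) * np.sin(np.pi * t)
phi2 = np.sqrt(2) * np.cos(np.pi * t)
L0 = 2.0 * np.outer(phi1, phi1) + 1.0 * np.outer(phi2, phi2)
noise_var = 0.1 + 0.4 * t
C = L0 + np.diag(noise_var)

stat1, _ = offdiag_rank_fit(C, r=1)   # rank too small: fit stays poor
stat2, _ = offdiag_rank_fit(C, r=2)   # true rank: off-diagonal fit is near zero
print(f"rank 1 fit: {stat1:.4g}, rank 2 fit: {stat2:.4g}")
```

In this toy setting the full matrix `C` has rank 20 because of the noise ridge, yet the off-diagonal fit at rank 2 is essentially exact while the fit at rank 1 is not. With an empirical covariance in place of `C`, the fit at the true rank would be small but nonzero, which is why a calibrated null distribution is needed.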