OPTIMAL CHANGE-POINT DETECTION AND LOCALIZATION
成果类型:
Article
署名作者:
Verzelen, Nicolas; Fromont, Magalie; Lerasle, Matthieu; Reynaud-Bouret, Patricia
署名单位:
INRAE; Universite de Rennes; Institut Polytechnique de Paris; ENSAE Paris; Universite Cote d'Azur
刊物名称:
ANNALS OF STATISTICS
ISSN/ISSBN:
0090-5364
DOI:
10.1214/23-AOS2297
发表日期:
2023
页码:
1586-1610
关键词:
least-squares estimation
multiple change-points
Minimax Rates
sequence
inference
cluster
SPARSE
number
摘要:
Given a times series Y in Rn, with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization, respectively amount to detecting the existence of times where the mean varies and estimating the positions of those changepoints. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is root 2 log log(n). When the energy is just above the detection threshold, then the problem of localizing the changepoint becomes purely parametric: it only depends on the difference in means and not on the position of the change-point anymore. Interestingly, for most change-point positions, including all those away from the endpoints of the time series, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight minimax rates for Hausdorff and l1 estimation losses of the vector of all change-points positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as O(n log(n)). Notably, these two procedures accommodate with the presence of possibly many low-energy and therefore undetectable changepoints and are still able to detect and localize high-energy change-points even with the presence of those nuisance parameters.
来源URL: