DISTRIBUTION AND QUANTILE FUNCTIONS, RANKS AND SIGNS IN DIMENSION d: A MEASURE TRANSPORTATION APPROACH

Document type:
Article
Authors:
Hallin, Marc; Del Barrio, Eustasio; Cuesta-Albertos, Juan; Matran, Carlos
Affiliations:
Universite Libre de Bruxelles; Universite Libre de Bruxelles; Universidad de Valladolid; Universidad de Valladolid; Universidad de Cantabria
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/20-AOS1996
Publication date:
2021
Pages:
1139-1165
Keywords:
multivariate quantiles optimal tests r-estimation order tests inference depth principal location shape interdirections
Abstract:
Unlike the real line, the real space R^d, for d >= 2, is not canonically ordered. As a consequence, such fundamental univariate concepts as quantile and distribution functions and their empirical counterparts, involving ranks and signs, do not canonically extend to the multivariate context. Palliating that lack of a canonical ordering has been an open problem for more than half a century, generating an abundant literature and motivating, among others, the development of statistical depth and copula-based methods. We show that, unlike the many definitions proposed in the literature, the measure transportation-based ranks and signs introduced in Chernozhukov, Galichon, Hallin and Henry (Ann. Statist. 45 (2017) 223-256) enjoy all the properties that make univariate ranks a successful tool for semiparametric inference. Related to those ranks, we propose a new center-outward definition of multivariate distribution and quantile functions, along with their empirical counterparts, for which we establish a Glivenko-Cantelli result. Our approach is based on McCann (Duke Math. J. 80 (1995) 309-323) and our results do not require any moment assumptions. The resulting ranks and signs are shown to be strictly distribution-free and essentially maximal ancillary in the sense of Basu (Sankhya 21 (1959) 247-256) which, in semiparametric models involving noise with unspecified density, can be interpreted as a finite-sample form of semiparametric efficiency. Although they constitute a sufficient summary of the sample, empirical center-outward distribution functions are defined at observed values only. A continuous extension to the entire d-dimensional space, yielding smooth empirical quantile contours and sign curves while preserving the essential monotonicity and Glivenko-Cantelli features of the concept, is provided. A numerical study of the resulting empirical quantile contours is conducted.
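
To make the idea of measure transportation-based ranks and signs concrete, here is a minimal Python sketch for d = 2. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify a grid or an algorithm, so the grid construction (n_R radii r/(n_R + 1) crossed with n_S equally spaced unit directions), the squared Euclidean cost, and the function names `regular_grid` and `center_outward_ranks_signs` are all assumptions made here. The optimal coupling is found by brute force over permutations, which is only feasible for a handful of points; a real computation would use an assignment solver.

```python
import itertools
import math
import random


def regular_grid(n_r, n_s):
    """Hypothetical regular grid in the unit disk.

    Returns n_r * n_s tuples (rank, sign_index, gx, gy): the point with
    rank r and sign index s sits at radius r / (n_r + 1) along the unit
    direction with angle 2*pi*s / n_s.
    """
    grid = []
    for r in range(1, n_r + 1):
        radius = r / (n_r + 1)
        for s in range(n_s):
            angle = 2 * math.pi * s / n_s
            grid.append((r, s, radius * math.cos(angle), radius * math.sin(angle)))
    return grid


def center_outward_ranks_signs(sample, n_r, n_s):
    """Assign each sample point a (rank, sign_index) pair.

    The assignment is the bijection between sample and grid that minimizes
    the sum of squared Euclidean distances (assumed cost). Brute force over
    all permutations, so keep len(sample) small.
    """
    grid = regular_grid(n_r, n_s)
    n = len(sample)
    if n != len(grid):
        raise ValueError("sample size must equal n_r * n_s in this sketch")
    # cost[i][j] = squared distance from sample point i to grid point j
    cost = [
        [(x - gx) ** 2 + (y - gy) ** 2 for (_, _, gx, gy) in grid]
        for (x, y) in sample
    ]
    best = min(
        itertools.permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    return [(grid[j][0], grid[j][1]) for j in best]


random.seed(0)
sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]
result = center_outward_ranks_signs(sample, n_r=2, n_s=3)
for point, (rank, sign) in zip(sample, result):
    print(f"({point[0]: .3f}, {point[1]: .3f}) -> rank {rank}, sign index {sign}")
```

Whatever the sample, the collection of assigned (rank, sign) pairs in this sketch is always the full grid, each grid point being used exactly once; only the matching between observations and grid points depends on the data. That is the mechanism behind the distribution-freeness mentioned in the abstract, shown here in a toy setting.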