Empirical geometry of multivariate data: A deconvolution approach

Document type:
Article
Author:
Koltchinskii, VI
Affiliation:
University of New Mexico
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/aos/1016218232
Publication date:
2000
Pages:
591-629
Keywords:
optimal rates; density; distributions; convergence; boundaries
Abstract:
Let $\{Y_j : j = 1, \dots, n\}$ be independent observations in $\mathbb{R}^m$, $m \ge 1$, with common distribution $Q$. Suppose that $Y_j = X_j + \xi_j$, $j = 1, \dots, n$, where $\{X_j, \xi_j : j = 1, \dots, n\}$ are independent, the $X_j$ have common distribution $P$, and the $\xi_j$ have common distribution $\mu$, so that $Q = P * \mu$. The problem is to recover the hidden geometric structure of the support of $P$ based on the independent observations $Y_j$. Assuming that the error distribution $\mu$ is known, deconvolution-based statistical estimates of the fractal dimension and of the hierarchical cluster tree of the support are suggested that converge at exponential rates. Moreover, the exponential rates of convergence hold for adaptive versions of the estimates even in the case of normal noise $\xi_j$ with unknown covariance. In the case of dimension estimation, though, the exponential rate holds only when the set of all possible values of the dimension is finite (e.g., when the dimension is known to be an integer). If this set is infinite, the optimal convergence rate of the estimator becomes very slow (typically logarithmic), even when there is no noise in the observations.
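
The observation model in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimator. It only generates data of the form Y_j = X_j + xi_j under assumptions chosen for illustration: P is taken to be uniform on the unit circle in R^2 (a support of dimension 1), mu is taken to be N(0, sigma^2 I), and the helper names `sample_noisy_circle` and `mean_squared_radius` are hypothetical. It then checks one elementary consequence of independence and zero-mean noise, E|Y|^2 = E|X|^2 + E|xi|^2.

```python
import math
import random


def sample_noisy_circle(n, sigma, seed=0):
    """Draw n observations Y_j = X_j + xi_j in R^2.

    Assumptions for illustration only: X_j is uniform on the unit circle
    (one possible P), and xi_j is N(0, sigma^2 I) noise (one possible mu).
    """
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x = (math.cos(theta), math.sin(theta))                # hidden point X_j
        xi = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))   # error xi_j
        ys.append((x[0] + xi[0], x[1] + xi[1]))               # observed Y_j
    return ys


def mean_squared_radius(points):
    """Average of |Y_j|^2 over the sample."""
    return sum(px * px + py * py for px, py in points) / len(points)


sigma = 0.3
ys = sample_noisy_circle(n=20000, sigma=sigma, seed=0)
# X_j and xi_j are independent and E xi_j = 0, so
# E|Y|^2 = E|X|^2 + E|xi|^2 = 1 + 2 * sigma^2 for this choice of P and mu.
print("empirical:", mean_squared_radius(ys))
print("theoretical:", 1 + 2 * sigma ** 2)
```

The noisy points no longer lie on the circle, which is why the support of P has to be recovered indirectly from the distribution Q = P * mu of the observations.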