AN OPTIMAL VARIABLE CELL HISTOGRAM BASED ON THE SAMPLE SPACINGS
Publication type:
Article
Author:
KANAZAWA, Y
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/aos/1176348523
Publication date:
1992
Pages:
291-304
Keywords:
density
Abstract:
Suppose we wish to construct a variable k-cell histogram based on an independent identically distributed sample of size n - 1 from an unknown density f on an interval of finite length. A variable cell histogram requires the cutpoints and heights of all of its cells to be specified. We propose the following procedure: (i) choose from the order statistics corresponding to the sample a set of k + 1 cutpoints that maximize a criterion, a function of the sample spacings; (ii) compute the heights of the k cells according to a formula. The resulting histogram estimates a k-cell theoretical histogram that stays constant within a cell and that minimizes the Hellinger distance to the density f. The histogram tends to estimate low density regions accurately and is easy to compute. We find that a number of cells of order n^(1/3) minimizes the mean Hellinger distance between the density f and a class of histograms whose cutpoints are chosen from the order statistics.
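The abstract does not state the criterion in step (i) or the height formula in step (ii), so the following Python sketch only illustrates the two-step structure under assumptions of its own. It assumes a plug-in criterion, the sum over cells of (sum of square roots of the spacings in the cell)^2 / (cell width), which is what results from substituting spacing-based estimates into the piecewise-constant function closest to f in Hellinger distance; this may differ from the paper's exact criterion. The function name `spacing_histogram`, the use of the sample minimum and maximum as the outer cutpoints, and the normalization of the heights are likewise illustrative choices, and the sample values are assumed to be distinct.

```python
import numpy as np

def spacing_histogram(sample, k):
    """Illustrative sketch: pick k+1 cutpoints among the order statistics by
    dynamic programming, then compute k cell heights.
    The criterion and height formula are ASSUMED, not taken from the paper."""
    x = np.sort(np.asarray(sample, dtype=float))   # order statistics
    m = len(x)
    if not 1 <= k <= m - 1:
        raise ValueError("k must satisfy 1 <= k <= len(sample) - 1")
    # r[j] = sum of sqrt(spacing) over the spacings to the left of x[j]
    r = np.concatenate(([0.0], np.cumsum(np.sqrt(np.diff(x)))))

    # best[c, j]: largest criterion value for c cells covering [x[0], x[j]]
    best = np.full((k + 1, m), -np.inf)
    back = np.zeros((k + 1, m), dtype=int)
    best[0, 0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, m):
            i = np.arange(c - 1, j)                # candidate left cutpoints
            cand = best[c - 1, i] + (r[j] - r[i]) ** 2 / (x[j] - x[i])
            p = int(np.argmax(cand))
            best[c, j] = cand[p]
            back[c, j] = i[p]

    # Backtrack from the last order statistic to recover the cutpoint indices
    idx = [m - 1]
    for c in range(k, 0, -1):
        idx.append(int(back[c, idx[-1]]))
    idx = idx[::-1]

    cuts = x[idx]
    s = np.diff(r[idx])                            # sqrt-spacing sum per cell
    w = np.diff(cuts)                              # cell widths
    # Assumed heights: proportional to (s / w)^2, scaled to integrate to one
    heights = (s / w) ** 2 / np.sum(s ** 2 / w)
    return cuts, heights

# Usage: number of cells of order n^(1/3), as the abstract suggests
rng = np.random.default_rng(0)
data = rng.beta(2.0, 5.0, size=200)
k = int(round(len(data) ** (1.0 / 3.0)))
cuts, heights = spacing_histogram(data, k)
```

The dynamic program costs on the order of k times the squared sample size, which is adequate for moderate samples; an exhaustive search over all cutpoint subsets would not be.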