您的位置: 首页 > 全球经管学术 > 顶刊追踪 > 顶尖期刊 > 统计学 > The Annals of Statistics > 1992 > 3期

MINIMUM IMPURITY PARTITIONS

成果类型：

Note

署名作者：

BURSHTEIN, D; DELLAPIETRA, V; KANEVSKY, D; NADAS, A

署名单位：

International Business Machines (IBM); IBM USA

刊物名称：

ANNALS OF STATISTICS

ISSN/ISSBN：

0090-5364

DOI：

10.1214/aos/1176348789

发表日期：

1992

页码：

1637-1646

关键词：

摘要：

Let (X, U) be jointly distributed on K X R(n). Let Y = E(U\X) and let U be the convex hull of the range of U. Let C: K --> C = {1, 2, ..., k}, k greater-than-or-equal-to 1, induce a measurable k way partition {K1, ...,K(k)} of K. Define the impurity of K(c) = C-1(c) to be phi(c, E(U\C(X) = c)), where phi: C x U --> R1 is a concave function in its second argument. Define the impurity psi(C) of the partition as the average impurity of its members: psi(C) = Ephi(C(X), E(U\C(X))). We show that for any C: K --> C there exists a mapping C: U --> C, such that psi(C(Y)) less-than-or-equal-to psi(C) and such that C-1(c) is convex, that is, for each i, j is-an-element-of C, i not-equal j, there exists a separating hyperplane between C-1(i) and C-1(j). This generalizes some results in statistics and information theory. Suitable choices of U and phi lead to optimal partitions of simple form useful in the construction of classification trees and multidimensional regression trees.