A SIMPLE MEASURE OF CONDITIONAL DEPENDENCE
Document type:
Article
Authors:
Azadkia, Mona; Chatterjee, Sourav
Affiliation:
Stanford University
Journal:
ANNALS OF STATISTICS
ISSN/ISBN:
0090-5364
DOI:
10.1214/21-AOS2073
Publication date:
2021
Pages:
3070-3102
Keywords:
VARIABLE SELECTION
Dimension Reduction
regression
INDEPENDENCE
copula
Abstract:
We propose a coefficient of conditional dependence between two random variables Y and Z given a set of other variables X_1, ..., X_p, based on an i.i.d. sample. The coefficient has a long list of desirable properties, the most important of which is that under absolutely no distributional assumptions, it converges to a limit in [0, 1], where the limit is 0 if and only if Y and Z are conditionally independent given X_1, ..., X_p, and is 1 if and only if Y is equal to a measurable function of Z given X_1, ..., X_p. Moreover, it has a natural interpretation as a nonlinear generalization of the familiar partial R^2 statistic for measuring conditional dependence by regression. Using this statistic, we devise a new variable selection algorithm, called Feature Ordering by Conditional Independence (FOCI), which is model-free, has no tuning parameters, and is provably consistent under sparsity assumptions. A number of applications to synthetic and real data sets are worked out.