Active Feature-Value Acquisition
Publication type:
Article
Authors:
Saar-Tsechansky, Maytal; Melville, Prem; Provost, Foster
Affiliations:
University of Texas System; University of Texas Austin; International Business Machines (IBM); IBM USA; New York University
Journal:
MANAGEMENT SCIENCE
ISSN/ISBN:
0025-1909
DOI:
10.1287/mnsc.1080.0952
Publication date:
2009
Pages:
664-684
Keywords:
information acquisition
predictive modeling
active learning
active feature acquisition
data mining
machine learning
business intelligence
imputation
utility-based data mining
Abstract:
Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring the values of features may be costly, and simply acquiring all values may be wasteful or prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although straightforward in principle, estimations and approximations must be made to apply the framework in practice. We present an acquisition policy, sampled expected utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with the policy of using representative sampling for feature acquisition, SEU reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains. We also extend the framework to a more general modeling setting in which feature values as well as class labels are missing and are costly to acquire.
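The ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the utility of a candidate acquisition is the expected model improvement divided by the acquisition cost, and that "sampled" means scoring only a uniform random subset of the candidates. The names `Query`, `expected_utility`, `select_acquisitions`, `value_prob`, `gain`, and `cost` are all hypothetical, and the estimators for value probabilities and model gain are left to the caller.

```python
import random
from typing import Callable, Hashable, Iterable, List, Sequence

# Hypothetical representation: a query is an (instance_id, feature_name) pair
# naming one missing feature value that could be acquired.
Query = Hashable


def expected_utility(
    query: Query,
    values: Sequence,
    value_prob: Callable[[Query, object], float],
    gain: Callable[[Query, object], float],
    cost: Callable[[Query], float],
) -> float:
    """Expected model improvement per unit cost for one candidate query.

    Assumption: value_prob(query, v) estimates the probability that the
    missing feature takes value v, and gain(query, v) estimates how much
    the model would improve if it did.
    """
    expected_gain = sum(value_prob(query, v) * gain(query, v) for v in values)
    return expected_gain / cost(query)


def select_acquisitions(
    candidates: Iterable[Query],
    values: Sequence,
    value_prob: Callable[[Query, object], float],
    gain: Callable[[Query, object], float],
    cost: Callable[[Query], float],
    sample_size: int,
    batch_size: int,
    rng: random.Random,
) -> List[Query]:
    """Score a random sample of candidates and return the top batch_size.

    Assumption: the sample is drawn uniformly; scoring only a sample keeps
    the number of utility estimates small when many values are missing.
    """
    pool = list(candidates)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    ranked = sorted(
        sample,
        key=lambda q: expected_utility(q, values, value_prob, gain, cost),
        reverse=True,
    )
    return ranked[:batch_size]
```

In an acquisition loop under these assumptions, the caller would acquire the returned queries, retrain the model, re-estimate `value_prob` and `gain`, and repeat until the budget is spent or the desired accuracy is reached.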