A DYNAMIC SCREENING ALGORITHM FOR HIERARCHICAL BINARY MARKETING DATA

Document type:
Article
Authors:
Fan, Yimei; Liao, Yuan; Ryzhov, Ilya O.; Zhang, Kunpeng
Affiliations:
University System of Maryland; University of Maryland College Park; Rutgers University System; Rutgers University New Brunswick; Rutgers University Newark; University System of Maryland; University of Maryland College Park
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/22-AOAS1720
Publication date:
2023
Pages:
2326-2344
Keywords:
GENERALIZED LINEAR-MODELS structured sparsity group lasso regression selection
Abstract:
In many applications of business and marketing analytics, predictive models are fit using hierarchically structured data: common characteristics of products, customers, or web pages are represented as categorical variables, and each category can be split up into multiple subcategories at a lower level of the hierarchy. The model may thus contain hundreds of thousands of binary variables, necessitating the use of variable selection to screen out large numbers of irrelevant or insignificant features. We propose a new dynamic screening method, based on the distance correlation criterion, designed for hierarchical binary data. Our method can screen out large parts of the hierarchy at the higher levels, avoiding the need to explore many lower-level features and greatly reducing the computational cost of screening. The practical potential of the method is demonstrated in a case application on user-brand interaction data from Facebook.
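
The general idea of top-down screening with distance correlation can be illustrated with a minimal sketch. This is not the authors' algorithm; it only shows how pruning a weakly associated parent category avoids evaluating its subcategories. Everything specific here is an assumption made for illustration: the function names `distance_correlation`, `column`, and `screen`, the toy hierarchy, the rule that a parent indicator is the elementwise OR of its children, and the fixed threshold of 0.3.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation of two 1-D arrays (V-statistic form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])  # pairwise distances within x
    b = np.abs(y[:, None] - y[None, :])  # pairwise distances within y
    # Double-center each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = max((A * B).mean(), 0.0)  # guard against tiny negative round-off
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else float(np.sqrt(dcov2 / denom))

def column(node, tree, leaves):
    """Binary indicator of a node: a leaf column, or the OR of its children (assumed)."""
    children = tree.get(node, [])
    if not children:
        return leaves[node]
    return np.maximum.reduce([column(c, tree, leaves) for c in children])

def screen(roots, tree, leaves, y, threshold):
    """Depth-first screening: a node below the threshold prunes its whole subtree."""
    kept, evaluated = [], 0
    stack = list(reversed(roots))
    while stack:
        node = stack.pop()
        evaluated += 1
        if distance_correlation(column(node, tree, leaves), y) < threshold:
            continue  # subcategories of this node are never evaluated
        kept.append(node)
        stack.extend(reversed(tree.get(node, [])))
    return kept, evaluated

# Hypothetical toy hierarchy: two categories with two subcategories each.
tree = {"A": ["A1", "A2"], "B": ["B1", "B2"]}
leaves = {
    "A1": np.array([1, 1, 1, 1, 0, 0, 0, 0]),
    "A2": np.array([0, 0, 0, 0, 1, 1, 0, 0]),
    "B1": np.array([1, 0, 1, 0, 1, 0, 1, 0]),
    "B2": np.array([0, 1, 0, 0, 0, 1, 0, 0]),
}
y = leaves["A1"].copy()  # the response depends only on subcategory A1

kept, evaluated = screen(["A", "B"], tree, leaves, y, threshold=0.3)
print(kept, evaluated)
```

In this toy data, category B is uncorrelated with the response, so its two subcategories are skipped and only four of the six nodes are scored. A fixed threshold is the simplest possible rule; a dynamic procedure would adapt the cutoff as the hierarchy is explored.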
Source URL: