Analyzing consumer-product graphs: Empirical findings and applications in recommender systems

成果类型:
Article
署名作者:
Huang, Zan; Zeng, Daniel D.; Chen, Hsinchun
署名单位:
Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of Arizona
刊物名称:
MANAGEMENT SCIENCE
ISSN/ISSBN:
0025-1909
DOI:
10.1287/mnsc.1060.0619
发表日期:
2007
页码:
1146-1164
关键词:
random graph theory consumer-purchase behavior topological features recommender systems collaborative filtering
摘要:
We apply random graph modeling methodology to analyze bipartite consumer-product graphs that represent sales transactions to better understand consumer purchase behavior in e-commerce settings. Based on two real-world e-commerce data sets, we found that such graphs demonstrate topological features that deviate significantly from theoretical predictions based on standard random graph models. In particular, we observed consistently larger-than-expected average path lengths and a greater-than-expected tendency to cluster. Such deviations suggest that the consumers' product choices are not random even with the consumer and product attributes hidden. Our findings provide justification for a large family of collaborative filtering-based recommendation algorithms that make product recommendations based only on previous sales transactions. By analyzing the simulated consumer-product graphs generated by models that embed two representative recommendation algorithms, we found that these recommendation algorithm-induced graphs generally provided a better match with the real-world consumer-product graphs than purely random graphs. However, consistent deviations in topological features remained. These findings motivated the development of a new recommendation algorithm based on graph partitioning, which aims to achieve high clustering coefficients similar to those observed in the real-world e-commerce data sets. We show empirically that this algorithm significantly outperforms representative collaborative filtering algorithms in situations where the observed clustering coefficients of the consumer-product graphs are sufficiently larger than can be accounted for by these standard algorithms.