Are Neighbors Alike? A Semisupervised Probabilistic Collaborative Learning Model for Online Review Spammers Detection

Publication type:
Review
Authors:
Wu, Zhiang; Liu, Guannan; Wu, Junjie; Tan, Yong
Affiliations:
Nanjing Audit University; Beihang University; University of Washington; University of Washington Seattle
Journal:
INFORMATION SYSTEMS RESEARCH
ISSN/ISBN:
1047-7047
DOI:
10.1287/isre.2022.0047
Publication date:
2024
Pages:
1565-1585
Keywords:
fake product manipulation IMPACT
Abstract:
Review spammers can harm the trustworthy environment of online platforms by purposefully posting inauthentic ratings and comments for products or online merchants, with the aim of gaining improper benefits. Although many methods have been proposed to resolve the spammer detection problem, several challenges, such as collusion recognition, label scarcity, and biased distributions, persist and call for further investigation. Building on prevalent collusive spamming behaviors and network homophily theory, we introduce a reviewer network to account for explicit coreview relations, and we propose a semisupervised probabilistic collaborative learning model that captures both reviewers' individual behavioral features and the reviewer network. Our model integrates partial label propagation with a pseudolabeling strategy and feature-based learning for reviewer network modeling, and it is proved theoretically to be a weighted logistic regression on a network-derived synthetic data set. The model's parameters, which characterize the importance of network information, the strength of network homophily, and the value of unlabeled data, make it more transparent. Empirical evaluations on two distinct real-life data sets demonstrate the effectiveness of our model and the significance of unlabeled data; the reviewer network, after proper trimming, shows notable homophily effects and plays a vital role. In particular, the proposed model is robust against label scarcity and biased label distribution.