
Are Neighbors Alike? A Semisupervised Probabilistic Collaborative Learning Model for Online Review Spammers Detection

Publication type:

Review

Authors:

Wu, Zhiang; Liu, Guannan; Wu, Junjie; Tan, Yong

Affiliations:

Nanjing Audit University; Beihang University; University of Washington; University of Washington Seattle

Journal:

INFORMATION SYSTEMS RESEARCH

ISSN/ISBN:

1047-7047

DOI:

10.1287/isre.2022.0047

Publication date:

2024

Pages:

1565-1585

Keywords:

fake product manipulation IMPACT

Abstract:

Review spammers can harm the trustworthy environment of online platforms by purposefully posting unauthentic ratings and comments for products or online merchants, with the aim of gaining improper benefits. Although many methods have been proposed to resolve the spammer detection problem, several challenges, such as collusion recognition, label scarcity, and biased distributions, are still persistent and call for further investigation. Building on prevalent collusive spamming behaviors and the network homophily theory, we introduce a reviewer network to account for explicit coreview relations, and then, we propose a semisupervised probabilistic collaborative learning model to capture both reviewers' individual behavioral features and the reviewer network. Our model features integrating partial label propagation with a pseudolabeling strategy and feature-based learning for reviewer network modeling, which is proved theoretically to be a weighted logistic regression on a network-derived synthetic data set. The rich parameters that characterize the importance of network information, the strength of network homophily, and the value of unlabeled data make our model more transparent. The empirical evaluations on two distinctive real-life data sets have demonstrated the effectiveness of our model and the significance of unlabeled data, in which the reviewer network after proper trimming demonstrates notable homophily effects and plays a vital role. In particular, the proposed model exhibits robustness against label scarcity and biased label distribution.