TM-OKC: An Unsupervised Topic Model for Text in Online Knowledge Communities

Publication type:
Article
Authors:
Zhang, Dongcheng; Zhang, Kunpeng; Yang, Yi; Schweidel, David A.
Affiliations:
Chinese University of Hong Kong; University System of Maryland; University of Maryland College Park; Hong Kong University of Science & Technology; Emory University
Journal:
MIS QUARTERLY
ISSN/ISBN:
0276-7783
DOI:
10.25300/MISQ/2023/17885
Publication date:
2024
Pages:
931-978
Keywords:
variational inference; ONLINE COMMUNITIES; INFORMATION; network; reviews; answers; QUALITY; IMPACT; MODEL; text
Abstract:
Online knowledge communities (OKCs), such as question-and-answer sites, have become increasingly popular venues for knowledge sharing. Accordingly, it is necessary for researchers and practitioners to develop effective and efficient text analysis tools to understand the massive amount of user-generated content (UGC) on OKCs. Unsupervised topic modeling has been widely adopted to extract human-interpretable latent topics embedded in texts. These identified topics can be further used in subsequent analysis and managerial practices. However, existing generic topic models that assume documents are independent are inappropriate for analyzing OKCs where structural relationships exist between questions and answers. Thus, a new method is needed to fill this research gap. In this study, we propose a new topic model specifically designed for the text in OKCs. We make three primary contributions to the research on topic modeling in this context. First, we build a general and flexible Bayesian framework to explicitly model structural and temporal dependencies among texts. Second, we statistically demonstrate the approximate model inference using mean-field and coordinate ascent algorithms. Third, we showcase the practical value and relative merit of our method via a specific downstream task (i.e., user profiling). The proposed model is illustrated using two real-world datasets from well-known OKCs (i.e., Stack Exchange and Quora), and extensive experiments demonstrate its superiority over several cutting-edge benchmarks.
Source URL: