sDTM: A Supervised Bayesian Deep Topic Model for Text Analytics
Document type:
Article
Authors:
Yang, Yi; Zhang, Kunpeng; Fan, Yangyang
Affiliations:
Hong Kong University of Science & Technology; University System of Maryland; University of Maryland College Park; Hong Kong Polytechnic University
Journal:
INFORMATION SYSTEMS RESEARCH
ISSN/ISBN:
1047-7047
DOI:
10.1287/isre.2022.1124
Publication date:
2023
Pages:
137-156
Keywords:
big data
online reviews
impact
classification
information
responses
Abstract:
Topic modeling methods such as latent Dirichlet allocation (LDA) are powerful tools for analyzing massive amounts of textual data. They have been used extensively in information systems (IS) and business discipline research to identify latent topics for data exploration and as a feature engineering mechanism to derive new variables for analyses. However, existing topic modeling approaches are mostly unsupervised and only leverage textual data, while ignoring additional useful metadata often associated with text, such as star ratings in customer reviews or categories of posts in online forums. As a result, the identified topics and variables derived based on the learned topic model may not be accurate, which could lead to incorrect estimations that affect subsequent empirical analysis and to inferior performance on predictive tasks. In this study, we propose a novel supervised deep topic modeling approach called sDTM, which combines a neural variational autoencoder model and a recurrent neural network. sDTM leverages the auxiliary data associated with text to enhance the topic modeling capability. We conduct empirical case studies and predictive analytics on an online consumer review data set and an online knowledge community data set. Experimental results show that in comparison with benchmark methods, sDTM can enhance both the empirical estimation and predictive performance. sDTM makes methodological contributions to the IS literature and has direct relevance for research using text analytics.
Source URL: