Topic Mining for Microblog Based on MB-LDA Model
Abstract
As microblogging grows more popular, services such as Twitter have become web-scale information providers. Early work on microblogs focused on user relationships and community structure and paid little attention to the value of the content itself, so microblog research needs to shift from analyzing user relationships alone to mining content as well. Although traditional text mining methods have been studied extensively, no algorithm has been designed specifically for microblog data, which contain structured social network information in addition to plain text. In this paper, we propose MB-LDA, a novel probabilistic generative model based on LDA that is suited to microblog data and takes both the contact relation and the document relation into account to aid topic mining. We present a Gibbs sampling implementation for inference in our model; the final results reveal not only the topics of microblogs but also the topics that contacts focus on. In addition, our model can be extended to other text associated with social networks, such as e-mails and forum posts. Experimental results on a real dataset show that the MB-LDA model offers an effective solution to topic mining for microblogs.