ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (5): 1022-1028. doi: 10.7544/issn1000-1239.2015.20131549

Microblog Bursty Topic Detection Method Based on Momentum Model

He Min (1,2), Du Pan (1), Zhang Jin (1), Liu Yue (1), Cheng Xueqi (1)

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
  2. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029
  • Online: 2015-05-01

Abstract: Microblogs reflect the general public's real-time reaction to major events. Detecting bursty topics in microblogs is an important step toward understanding the current events that attract large numbers of Internet users. However, existing methods designed for news articles cannot be applied directly to microblogs, because microblogs differ from formal texts in their diversity, dynamics, and noise. This paper proposes a new method for detecting bursty topics in microblogs based on a momentum model. Meaningful strings are extracted from the microblog posts in a specified time window and serve as the dynamic features. The dynamic characteristics of these features are modeled on the principle of momentum: the velocity, acceleration, and momentum of each feature are defined from its dynamic frequencies at different dimensions. Bursty features are detected by combining momentum, variation trend, and second-order change rate. Bursty topics are then obtained by merging the detected bursty features using mutual information. Experiments on a real Sina microblog data set containing around 526 thousand posts from 1000 users show that the proposed method improves precision and recall remarkably compared with conventional methods. The proposed method can be applied well to online bursty topic detection in microblog data.
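
The abstract does not give the formulas, so the following is only a minimal sketch of how a momentum-style burst test on a feature's frequency series might look. Every definition in it is an assumption made for illustration and is not taken from the paper: velocity as the first difference of frequency, acceleration as the second difference, mass as the mean frequency, and the names `dynamics`, `is_bursty`, and `momentum_threshold` are all hypothetical. The extraction of meaningful strings and the mutual-information merging step are not covered.

```python
from typing import Dict, List


def dynamics(freqs: List[float]) -> Dict[str, float]:
    """Velocity, acceleration and momentum of a feature at the latest window.

    Assumed definitions (illustrative only, not the paper's):
      velocity     = f[t] - f[t-1]
      acceleration = velocity[t] - velocity[t-1]
      mass         = mean frequency over the whole series
      momentum     = mass * velocity
    """
    if len(freqs) < 3:
        raise ValueError("need at least three time windows")
    velocity = freqs[-1] - freqs[-2]
    prev_velocity = freqs[-2] - freqs[-3]
    acceleration = velocity - prev_velocity
    mass = sum(freqs) / len(freqs)
    return {
        "velocity": velocity,
        "acceleration": acceleration,
        "momentum": mass * velocity,
    }


def is_bursty(freqs: List[float], momentum_threshold: float = 50.0) -> bool:
    """Flag a feature as bursty when momentum is large, the trend is
    rising (velocity > 0) and the second-order change is positive."""
    d = dynamics(freqs)
    return (
        d["momentum"] > momentum_threshold
        and d["velocity"] > 0
        and d["acceleration"] > 0
    )


# Hypothetical per-window frequencies for two candidate features.
series = {
    "earthquake": [2, 3, 5, 20],
    "weather": [10, 11, 10, 11],
}
bursty = [name for name, f in series.items() if is_bursty(f)]
print(bursty)
```

With these made-up counts, only the feature whose frequency jumps sharply in the last window passes all three conditions; the threshold value is arbitrary and would need tuning on real data.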

Key words: bursty topic, microblog, bursty feature, meaningful string, momentum model
