ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (11): 2517-2526. doi: 10.7544/issn1000-1239.2015.20148133


EMTM: A Method for Experts Mining in Micro-Blog with Topic-Level

Zhang Lamei1,2,3,4, Huang Weijing3,4, Chen Wei3,4, Wang Tengjiao3,4, Lei Kai1,2   

  1. Shenzhen Key Laboratory for Cloud Computing Technology & Applications (Peking University), Shenzhen, Guangdong 518055
  2. School of Electronics and Computer Engineering, Peking University, Shenzhen, Guangdong 518055
  3. Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871
  4. School of Electronics Engineering and Computer Science, Peking University, Beijing 100871
  • Online: 2015-11-01

Abstract: Micro-blogs have become one of the most popular platforms for accessing and sharing information. After years of development, they host many experts with authoritative professional knowledge. Mining experts at the topic level benefits user recommendation and public opinion analysis in micro-blogs. In a micro-blog, the experts on a topic are the users who have high influence on that topic because they have authoritative professional knowledge and skills related to it. High influence is a necessary condition for being an expert. Influence is a subjective notion and needs to be quantified objectively. In micro-blogs, the probability of being retweeted is one of the most important indicators of a user's influence, so highly influential users can be found by analyzing retweet data. However, users exhibit two kinds of retweet behavior: “topic-sensitive retweet” and “following retweet”. Consequently, users who are influential merely because they are retweeted with high probability are not always experts. In this paper, we propose EMTM (experts mining topic model), a probabilistic generative model that finds topic-level experts by distinguishing the two kinds of retweet behavior. We use Gibbs sampling for model inference. Experiments on real Sina Weibo data show that EMTM is effective at mining topic-level experts.

Key words: experts, topic, micro-blog, retweet behavior, probability model
