    EMTM: A Method for Experts Mining in Micro-Blog with Topic-Level

    • Abstract: Microblogs have become one of the most popular platforms for accessing and sharing information. Over a long period of growth, they have attracted many experts with authoritative professional backgrounds, and mining topic-related experts supports further work such as user recommendation and public opinion analysis. In a microblog, an expert on a topic is a user who is highly influential on that topic because they have reliable professional knowledge or skills related to it. High influence is therefore a necessary condition for being an expert. Influence is a subjective notion that needs to be quantified objectively, and the probability of being retweeted is one of the most important indicators of a user's influence, so highly influential users can be found by analyzing retweet data. However, users' retweet behavior falls into two kinds: “topic-sensitive retweet” and “following retweet”. Consequently, a user who is influential only because of a high probability of being retweeted is not necessarily an expert. In this paper, we propose EMTM (experts mining topic model), a probabilistic generative model based on topic models that mines topic-related experts by distinguishing between the two kinds of retweet behavior. We use Gibbs sampling for model inference. Comparative experiments on a real Sina Weibo dataset show that EMTM can effectively mine topic-related experts in microblogs.

       
