    Research on Blog Opinion Retrieval Based on Probabilistic Inference Model

    • Abstract: In recent years, blogs (short for "weblogs") have attracted growing attention from the public and from industry as an emerging medium for mass publishing. By citing (trackbacking) and recommending one another, blogs form a huge blogosphere. In the blogosphere, people can freely express their opinions and feelings about all kinds of real-life issues, and can also comment on new products on the market. Accurately retrieving bloggers' opinions on important topics and hot events is of great value for applications such as market research and the discovery and early warning of online public opinion. The goal of blog opinion retrieval is to retrieve blog posts that are both topically relevant to a given query and contain comments about that query. To this end, the paper applies a probabilistic inference model to blog opinion retrieval and proposes a retrieval algorithm based on it. The algorithm merges topical relevance scoring and opinion scoring into a unified probabilistic inference model: it computes the topical relevance between the topic descriptions appearing in a blog post and the query, measures how strongly the opinion words express sentiment toward the query topic, and fuses the two scores into a final overall score. Experiments show that the algorithm effectively identifies opinions in the blogosphere that are relevant to a given query and achieves good results.

       
