    The Social and Conceptual Semantic Extended Search Method for Microblog Short Text

    • Abstract: Fully mining the semantics of microblog short texts is essential for accurate microblog search. Because microblog content is sparse and semantically limited, traditional search methods that rely only on literal semantics for short-text understanding and similarity matching are restricted in what they can achieve. We therefore propose an extended search method that combines social and conceptual semantics. By exploiting social attributes unique to social networks, such as the #hashtag#, the “@” mention, and URL link information, the method further extends microblog short texts with social semantics. It combines the concept words obtained from literal analysis of the text with potentially associated hashtag information in a graph structure formed by social relationships, representing the semantic features of each short text from these two perspectives and enabling precise search based on a fuller understanding of short-text semantics. Comparative experiments on microblog datasets show that, compared with existing extended search methods, the proposed method captures more semantic features and significantly improves microblog search performance.
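
The idea of extending a short text with its social markers can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how concept words are derived or how the hashtag graph is built, so the sketch below only extracts #hashtag#, "@" mention, and URL markers with regular expressions and adds them to the literal terms. The function names, the regular expressions, the whitespace-style tokenization, and the use of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
import re

# Assumed patterns: Weibo-style paired #tag#, @mention, and http(s) URLs.
HASHTAG_RE = re.compile(r"#([^#\n]+)#")
MENTION_RE = re.compile(r"@([\w\-]+)")
URL_RE = re.compile(r"https?://\S+")


def social_features(text):
    """Extract the social markers named in the abstract from one post."""
    urls = URL_RE.findall(text)
    # Strip URLs first so that '#' fragments inside links are not read as hashtags.
    stripped = URL_RE.sub(" ", text)
    return {
        "hashtags": set(HASHTAG_RE.findall(stripped)),
        "mentions": set(MENTION_RE.findall(stripped)),
        "urls": set(urls),
    }


def literal_terms(text):
    """Literal (surface) terms: a stand-in for the paper's concept words."""
    stripped = URL_RE.sub(" ", text)
    stripped = MENTION_RE.sub(" ", stripped)
    stripped = stripped.replace("#", " ")
    return set(re.findall(r"\w+", stripped.lower()))


def extended_terms(text):
    """Union of literal terms and prefixed social tokens (hypothetical representation)."""
    social = social_features(text)
    terms = set(literal_terms(text))
    terms |= {"#" + tag.lower() for tag in social["hashtags"]}
    terms |= {"@" + name.lower() for name in social["mentions"]}
    return terms


def jaccard(a, b):
    """Jaccard overlap of two term sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    post = "Great talk on #deep learning# by @alice http://t.cn/abc123"
    query = "#deep learning# tutorial"
    print(social_features(post))
    print(jaccard(extended_terms(post), extended_terms(query)))
```

In this sketch a query that shares a hashtag with a post gains overlap from both the hashtag token and its constituent words, which is one simple way a social marker can add matching evidence beyond the literal text.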

       
