The Social and Conceptual Semantic Extended Search Method for Microblog Short Text
Abstract: Fully mining the semantics of microblog short texts to enable accurate search is an important task. Because microblog content is sparse and semantically limited, traditional search methods that rely only on literal semantics for short-text understanding and similarity matching are restricted. We therefore propose an extended search method that combines social and conceptual semantics. By exploiting social attributes unique to social networks, such as the #hashtag#, the "@" mention, and URL link information, the method further extends microblog short texts with social semantics. It combines the conceptual words obtained from literal analysis of the text with potentially associated hashtag information in a graph structure formed by social relationships. This represents the semantic features of each short text from two perspectives and enables precise search based on a fuller understanding of short-text semantics. Comparative experiments on microblog datasets show that, compared with existing extended search methods, the proposed method captures more semantic features and significantly improves microblog search performance.
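To illustrate the three social attributes the abstract names (#hashtag#, "@" mention, and URL), the following Python sketch pulls them out of a microblog text and appends the hashtags to a list of literal terms. It is a rough illustration only. The regular expressions, the function names `extract_social_features` and `expand_terms`, and the simple "append hashtags" expansion are all assumptions made here, not the authors' implementation. The graph-based association of hashtags described in the abstract is not reproduced.

```python
import re

# Assumed patterns (not from the paper): Weibo-style #tag#, @name, and http(s) links.
HASHTAG_RE = re.compile(r"#([^#\s]+)#")
MENTION_RE = re.compile(r"@([\w\-]+)")
URL_RE = re.compile(r"https?://\S+")


def extract_social_features(text):
    """Return the hashtags, mentions, and URLs found in one microblog text."""
    urls = URL_RE.findall(text)
    # Remove URLs first so that '#' or '@' inside a link is not misread.
    without_urls = URL_RE.sub(" ", text)
    return {
        "hashtags": HASHTAG_RE.findall(without_urls),
        "mentions": MENTION_RE.findall(without_urls),
        "urls": urls,
    }


def expand_terms(literal_terms, text):
    """Hypothetical expansion step: append hashtags not already among the terms."""
    expanded = list(literal_terms)
    for tag in extract_social_features(text)["hashtags"]:
        if tag not in expanded:
            expanded.append(tag)
    return expanded


if __name__ == "__main__":
    post = "Great match tonight #WorldCup# @alice http://t.cn/abc123"
    print(extract_social_features(post))
    print(expand_terms(["match", "tonight"], post))
```

In a fuller pipeline, the extracted mentions and URLs would presumably feed the social-relationship graph rather than being used directly as search terms. That step depends on details not given in the abstract.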