Hot Topic Detection of Web Video Based on Cross-Media Semantic Association Enhancement

张承德; 刘雨宣; 肖霞; 梅凯

doi:10.7544/issn1000-1239.202220560

Abstract

Abstract: Cross-media hot topic detection for web videos has become an active research area. However, web videos usually come with little descriptive text, so the textual semantic feature space is sparse and the associations between textual semantic features are weak, which makes hot topics harder to mine. Existing methods mainly enrich the textual semantic feature space with visual information. Yet because visual and textual information are heterogeneous, the textual and visual semantic features under the same topic differ considerably. This further weakens the association between textual semantics within a topic and poses a major challenge for cross-media hot topic detection of web videos. We therefore propose a new cross-media semantic association enhancement method. First, a two-layer attention mechanism captures the core textual semantic features at the word level and the sentence level. Second, by understanding the visual content, the method generates a large number of text descriptions that are highly relevant to the video content, which enriches the textual semantic space. Third, a textual semantic graph and a visual semantic graph are built from textual semantic similarity and visual semantic similarity, respectively, and a time decay function is constructed to relate cross-media data along the time dimension. This strengthens the association between textual and visual semantics and allows the two semantic graphs to be fused smoothly into a hybrid semantic graph, so that the two media complement each other. Finally, hot topics are detected by graph clustering. Extensive experiments show that the proposed model outperforms existing methods.
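The graph construction and fusion steps can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: cosine similarity for both semantic graphs, an exponential decay `exp(-|t_i - t_j| / tau)` as the time decay function, a convex combination with weight `alpha` as the fusion rule, and thresholded connected components as a simple stand-in for the graph clustering step. The function names and parameters (`hybrid_graph`, `threshold_clusters`, `alpha`, `tau`, `threshold`) are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def time_decay(t_i, t_j, tau=7.0):
    """Assumed decay form: weight shrinks as upload times move apart."""
    return math.exp(-abs(t_i - t_j) / tau)


def hybrid_graph(text_vecs, visual_vecs, days, alpha=0.5, tau=7.0):
    """Fuse text and visual similarity graphs into one weighted adjacency matrix.

    Assumed fusion rule: a convex combination of the two similarities,
    scaled by the time decay between the two videos.
    """
    n = len(days)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            text_sim = cosine(text_vecs[i], text_vecs[j])
            visual_sim = cosine(visual_vecs[i], visual_vecs[j])
            fused = alpha * text_sim + (1.0 - alpha) * visual_sim
            w[i][j] = w[j][i] = fused * time_decay(days[i], days[j], tau)
    return w


def threshold_clusters(w, threshold=0.5):
    """Stand-in for graph clustering: connected components over strong edges."""
    n = len(w)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and w[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters
```

Under these assumptions, two videos with similar text and visual features uploaded a day apart keep a strong edge, while a video uploaded weeks later is down-weighted even when its features overlap, so it falls into a separate cluster.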
