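Scientific and Technological Information Oriented Semantics-Adversarial and Media-Adversarial Based Cross-Media Retrieval Method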

李昂; 杜军平; 寇菲菲; 薛哲; 徐欣; 许明英; 姜阳

doi:10.7544/issn1000-1239.202220430

Scientific and Technological Information Oriented Semantics-Adversarial and Media-Adversarial Based Cross-Media Retrieval Method

Abstract

Cross-media retrieval of scientific and technological information is an important task in cross-media research, and it must bridge both the heterogeneity gap and the semantic gap between multimedia data. It lets users obtain target information from massive, multi-source, heterogeneous scientific and technological resources, which in turn supports applications that meet users' needs, such as scientific and technological information recommendation and personalized scientific and technological information retrieval. The core of cross-media retrieval is to learn a common subspace in which data from different media can be compared directly. In subspace learning, existing methods often focus on modeling the discrimination of intra-media data and the invariance of inter-media data after mapping, while ignoring the semantic consistency of inter-media data before and after mapping and the media discrimination within semantics, which limits retrieval performance. In light of this, we propose a semantics-adversarial and media-adversarial cross-media retrieval method (SMCR) for scientific and technological information, which seeks an effective common subspace. Specifically, in addition to modeling intra-media semantic discrimination, SMCR minimizes an inter-media semantic consistency loss to preserve semantic similarity before and after mapping. Furthermore, SMCR constructs a basic feature mapping network and a refined feature mapping network and jointly minimizes the media discrimination loss within semantics, which strengthens the feature mapping networks' ability to confuse the media discriminant network. Extensive experiments on two datasets show that the proposed SMCR outperforms state-of-the-art methods in cross-media retrieval.
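The idea of a common subspace in which different media can be compared directly can be illustrated with a small sketch. This is not the SMCR model: the adversarial training, the basic and refined feature mapping networks, and the media discriminant network are not shown. The toy data, the names `W_img` and `W_txt`, and the use of pseudo-inverses as stand-ins for learned mappings are all assumptions made for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def project(features, weight):
    """Map media-specific features into the common subspace (linear stand-in)."""
    return l2_normalize(features @ weight)

def retrieve(query_emb, gallery_emb, top_k=3):
    """Return, for each query, the indices of the top_k most similar gallery items."""
    scores = query_emb @ gallery_emb.T
    return np.argsort(-scores, axis=1)[:, :top_k]

rng = np.random.default_rng(0)
n_items, d_img, d_txt, d_common = 5, 8, 6, 4

# Toy paired data (assumption): item i's image and text features are
# generated from the same latent semantic vector.
latent = rng.normal(size=(n_items, d_common))
A_img = rng.normal(size=(d_common, d_img))
A_txt = rng.normal(size=(d_common, d_txt))
img_feats = latent @ A_img
txt_feats = latent @ A_txt

# Hypothetical "learned" mappings: pseudo-inverses of the generators.
# A real method would learn these mappings from data instead.
W_img = np.linalg.pinv(A_img)  # shape (d_img, d_common)
W_txt = np.linalg.pinv(A_txt)  # shape (d_txt, d_common)

img_emb = project(img_feats, W_img)
txt_emb = project(txt_feats, W_txt)

# Text-to-image retrieval: each text query ranks all images.
ranks = retrieve(txt_emb, img_emb, top_k=3)
print(ranks[:, 0])
```

Because both media are mapped into the same space and normalized, a single dot product scores any text against any image. In this toy setup each text query should rank its paired image first.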
