ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2021, Vol. 58 ›› Issue (7): 1456-1465. doi: 10.7544/issn1000-1239.2021.20200804

Special topic: False Information Detection (2021)

• Information Processing •



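  1. (Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190) (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (University of Chinese Academy of Sciences, Beijing 100049)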
  • Published: 2021-07-01
  • Funding: 

Semantics-Enhanced Multi-Modal Fake News Detection

Qi Peng, Cao Juan, Sheng Qiang   

  1. (Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190) (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2021-07-01
  • Supported by: 
    This work was supported by the Key Program of the National Natural Science Foundation of China (U1703261).

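Abstract: In recent years, social media has gradually become the main channel through which people obtain news. While it brings convenience, it has also promoted the spread of fake news. As social media becomes increasingly media-rich, fake news is shifting from text-only to multi-modal form, so multi-modal fake news detection is receiving growing attention. Most existing multi-modal fake news detection methods rely on appearance-level features that are highly correlated with the dataset and model the semantic-level features of news insufficiently. As a result, they struggle to understand the deep semantics of textual and visual entities, and their ability to generalize to new data is limited. This paper proposes a semantics-enhanced multi-modal fake news detection method that better understands the deep semantics of multi-modal news by exploiting the factual knowledge implicit in pre-trained language models and by explicitly extracting visual entities. It extracts visual features at different semantic levels and, on this basis, uses a text-guided attention mechanism to model the semantic interaction between text and images, thereby better fusing the heterogeneous multi-modal features. Experimental results on a real-world dataset built from Weibo news show that the method effectively improves the performance of multi-modal fake news detection.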

Key words: social media, fake news detection, multi-modal, knowledge fusion, attention mechanism

Abstract: In recent years, social media has become the main channel through which people obtain the latest news. However, the convenience and openness of social media have also facilitated the proliferation of fake news. With the development of multimedia technology, fake news on social media has been evolving from text-only posts to multimedia posts containing images or videos, and multi-modal fake news detection is therefore attracting growing attention. Existing methods for multi-modal fake news detection mostly capture appearance-level features that depend heavily on the dataset distribution, and they insufficiently exploit semantics-level features. As a result, these methods often fail to understand the deep semantics of the textual and visual entities in fake news, which limits their generalizability in real applications. To tackle this problem, this paper proposes a semantics-enhanced multi-modal model for fake news detection. The model better captures the underlying semantics of multi-modal news by implicitly utilizing the factual knowledge in a pre-trained language model and explicitly extracting visual entities. Furthermore, it extracts visual features at different semantic levels and models the semantic interaction between textual and visual features with a text-guided attention mechanism, which better fuses the heterogeneous multi-modal features. Extensive experiments on the Weibo dataset show that the method significantly outperforms the state of the art.

Key words: social media, fake news detection, multi-modal, knowledge fusion, attention mechanism
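
The text-guided attention mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's actual formulation, so everything here is an assumption made for illustration: the function name `text_guided_attention`, the scaled dot-product scoring, the premise that text and visual features already share one dimension, and the toy vectors. It shows only the general idea of letting a text representation decide how much each visual feature contributes to the fused result.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def text_guided_attention(text_vec, visual_feats):
    """Pool visual features using weights derived from the text vector.

    text_vec:     list of d floats (hypothetical text representation)
    visual_feats: list of n vectors, each with d floats (hypothetical
                  visual features, assumed projected to dimension d)
    Returns the attended visual vector and the attention weights.
    """
    scale = math.sqrt(len(text_vec))
    # Score each visual feature by its similarity to the text vector.
    scores = [dot(text_vec, v) / scale for v in visual_feats]
    weights = softmax(scores)
    # Weighted sum of the visual features, one component at a time.
    dim = len(visual_feats[0])
    attended = [
        sum(w * v[i] for w, v in zip(weights, visual_feats))
        for i in range(dim)
    ]
    return attended, weights


# Toy inputs: the first visual feature aligns with the text, the last opposes it.
text_vec = [1.0, 0.0]
visual_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
attended, weights = text_guided_attention(text_vec, visual_feats)

# A simple fusion step (also an assumption): concatenate text and attended vision.
fused = text_vec + attended
```

In this toy setup the visual feature most similar to the text receives the largest weight. A trained model would typically learn projection matrices for the query and keys instead of comparing raw vectors.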