多模态深度伪造及检测技术综述

李泽宇; 张旭鸿; 蒲誉文; 伍一鸣; 纪守领

doi:10.7544/issn1000-1239.202111119

多模态深度伪造及检测技术综述

A Survey on Multimodal Deepfake and Detection Techniques

摘要

摘要: 随着各种深度学习生成模型在各领域的应用，生成的多媒体文件的真伪越来越难以辨别，深度伪造技术也因此得以诞生和发展. 深度伪造技术通过深度学习相关技术能够篡改视频或者图片中的人脸身份信息、表情和肢体动作，以及生成特定人物的虚假语音. 自2018年Deepfakes技术在社交网络上掀起换脸热潮开始，大量的深度伪造方法被提出，并展现了其在教育、娱乐等领域的潜在应用. 但同时深度伪造技术在社会舆论、司法刑侦等方面产生的负面影响也不容忽视. 因此有越来越多的对抗手段被提出用于防止深度伪造被不法分子所应用，如深度伪造的检测和水印. 首先，针对不同模态类型的深度伪造技术以及相应的检测技术进行了回顾和总结，并根据研究目的和研究方法对现有的研究进行了分析和归类；其次，总结了近年研究中广泛使用的视频和音频数据集；最后，探讨了该领域未来发展面临的机遇和挑战.

Abstract: With the application of all kinds of deep learning generation models in various fields, the authenticity of their generated multimedia files has become increasingly difficult to distinguish, therefore, deepfake technology has been born and developed. Utilizing deep learning related techniques, the deepfake technology can tamper with the facial identity information, expressions, and body movements in videos or pictures, and generate fake voice of a specific person. Since 2018, when Deepfakes sparked a wave of face swapping on social networks, a large number of deepfake methods have been proposed, which had demonstrated their potential applications in education, entertainment, and some other fields. But at the same time, the negative impact of deepfake on public opinion, judicial and criminal investigations, etc. can not be ignored. As a consequence, more and more countermeasures have been proposed to prevent deepfake from being utilized by the criminals, such as the detection of deepfake and watermark. Firstly, a review and summary of deepfake technologies of different modal types and corresponding detection technologies are carried out, and the existing researches are analyzed and classified according to the research purpose and research method. Secondly, the video and audio datasets widely used in the recent studies are summarized. Finally, the opportunities and challenges for future development in this field are discussed.

HTML全文

参考文献(181)

施引文献

资源附件(0)