Survey of Adversarial Attack, Defense and Robustness Analysis for Natural Language Processing

Zheng Haibin; Chen Jinyin; Zhang Yan; Zhang Xuhong; Ge Chunpeng; Liu Zhe; Ouyang Yike; Ji Shouling

doi:10.7544/issn1000-1239.2021.20210304

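¹(College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023)
²(Cyberspace Security Research Institute, Zhejiang University of Technology, Hangzhou 310023)
³(College of Control Science and Engineering, Zhejiang University, Hangzhou 310063)
⁴(College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106)
⁵(Nanjing Research Center, Huawei Technologies Co., Ltd., Nanjing 210029)
⁶(College of Computer Science and Technology, Zhejiang University, Hangzhou 310063) (haibinzheng320@gmail.com)

Funds: This work was supported by the National Natural Science Foundation of China (62072406), the Natural Science Foundation of Zhejiang Province (LY19F020025), and the Major Special Funding for “Science and Technology Innovation 2025” in Ningbo (2018B10063).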

Details

CLC number: TP391
Publication history
- Published: 2021-07-31

Abstract

With the rapid development of artificial intelligence, deep neural networks have been widely applied in computer vision, signal analysis, and natural language processing. Natural language processing helps machines process, understand, and use human language through functions such as syntactic analysis, semantic analysis, and discourse comprehension. However, existing studies have shown that deep neural networks are vulnerable to adversarial texts: adding imperceptible perturbations to normal text can cause a natural language processing model to make wrong predictions. To improve model robustness and security, research on defenses has also emerged in recent years. This survey comprehensively reviews existing work on adversarial attacks and defenses in natural language processing. Specifically, we first introduce the main tasks of natural language processing and the related methods. Second, we categorize attack and defense methods according to their underlying mechanisms. We then analyze the certified robustness of natural language processing models and the benchmark datasets used for evaluation, and give a detailed introduction to natural language processing application platforms and toolkits. Finally, we summarize future research directions for attack and defense security in natural language processing.

Keywords:
- deep neural network
- natural language processing
- adversarial attack
- defense
- robustness
