

    Survey of Security Attack and Defense Strategies for Artificial Intelligence Model

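    • Abstract (translated from the Chinese): In recent years, artificial intelligence technology, represented by deep learning, has developed rapidly and has been widely applied in computer vision, natural language processing, and many other fields. However, recent studies show that these advanced AI models harbor potential security risks that may affect the reliability of AI applications. To this end, this paper surveys the latest research results on security attacks, attack detection, and defense strategies for AI models. On model security attacks, it focuses on the principles and technical status of adversarial attacks, model inversion attacks, model stealing attacks, and related attacks; on attack detection, it focuses on detection methods such as defensive distillation, regularization, outlier detection, and robust statistics; on defense strategies, it focuses on techniques such as adversarial training, model structure defense, and query control defense. The paper summarizes and extends the techniques and methods related to AI model security, providing theoretical support for the secure application of models. It also enables researchers to better understand the current state of research in this field and to choose appropriate future research directions.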


      Abstract: In recent years, artificial intelligence technology, particularly deep learning, has developed rapidly and is now widely applied in fields such as computer vision and natural language processing. However, recent research indicates that these advanced AI models carry potential security risks that could compromise the reliability of AI applications. This survey therefore reviews cutting-edge research on security attacks, attack detection, and defense strategies for artificial intelligence models. For model security attacks, it focuses on the principles and current state of the art of adversarial attacks, model inversion attacks, and model stealing attacks. For attack detection, it covers defensive distillation, regularization, outlier detection, and robust statistics. For defense strategies, it covers adversarial training, model structure defenses, and query control defenses. The survey summarizes and extends the techniques and methods related to AI model security, providing theoretical support for the secure application of models. It also helps researchers better understand the current state of research in this field and choose appropriate directions for future work.


