Abstract:
In recent years, the rapid development of artificial intelligence, particularly deep learning, has led to its widespread application in fields such as computer vision and natural language processing. However, recent research indicates that these advanced AI models carry potential security risks that could compromise their reliability. In light of this concern, this survey examines cutting-edge research on security attacks, attack detection, and defense strategies for artificial intelligence models. Regarding model security attacks, our work elucidates the principles and technical status of adversarial attacks, model inversion attacks, and model theft attacks. The attack detection methods explored in this paper include defensive distillation, regularization, outlier detection, and robust statistics. The model defense strategies examined encompass adversarial training, model structure defenses, query control defenses, and other technical means. This survey not only summarizes but also expands upon techniques and methodologies for ensuring the security of artificial intelligence models, thereby providing a solid theoretical foundation for their secure application. It also enables researchers to better understand the current state of the art in this field, informing their selection of future research directions.