False Positive Adversarial Example Against Object Detectors

袁小鑫; 胡军; 黄永洪

doi:10.7544/issn1000-1239.20210658

Abstract: Object detectors are now widely deployed in intelligent systems, where they identify and localize objects in images. Recent studies show, however, that object detectors, like DNN classifiers, are vulnerable to both digital and physical adversarial examples. YOLOv3 is a mainstream object detector for real-time detection tasks. Most existing physical adversarial examples against YOLOv3 are constructed by printing a relatively large adversarial perturbation and pasting it onto the surface of an object of a specific class. The false positive adversarial example (FPAE), which has appeared in recent research, can instead be generated directly from the target model: humans cannot recognize any content in the image, yet the object detector misidentifies it with high confidence as the target class specified by the attacker. The only existing method for generating FPAEs with YOLOv3 as the target model is the AA (appearing attack) method. To improve the robustness of the FPAE, AA adds EOT (expectation over transformation) image transformations during iterative optimization to simulate various physical conditions, but it does not account for the motion blur that may occur during shooting, which weakens the attack. In addition, the FPAEs it generates achieve a low attack success rate in black-box attacks on object detectors other than YOLOv3. To generate better-performing FPAEs, and thereby reveal weaknesses in existing object detectors and test their security, we take YOLOv3 as the target model and propose the RTFP (robust and transferable false positive) adversarial attack method. During iterative optimization, RTFP applies a motion blur transformation in addition to the typical image transformations. Its loss function borrows the design idea of the loss function in the C&W attack, and uses the IOU (intersection over union) between the bounding boxes predicted by the target model in the grid cell containing the center of the FPAE and the ground-truth bounding box of the FPAE as the weight on the classification loss of those predicted boxes. In real-world shooting tests over multiple angle and distance combinations and in driving tests on actual roads, the FPAEs generated by RTFP remain robust and transfer better than those generated by existing methods.
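The following is a minimal sketch of the building blocks the abstract names, not the RTFP implementation itself. The IOU and the C&W-style targeted margin are standard definitions. Everything else is an assumption made for illustration: the function names, the corner-coordinate box format, the generic per-class scores, the way the two terms are multiplied together in `iou_weighted_class_loss`, and the simple horizontal averaging used for motion blur. The abstract does not give the exact loss formula or blur kernel.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cw_margin(scores: Sequence[float], target: int, kappa: float = 0.0) -> float:
    """C&W-style targeted margin: max(max_{i != t} z_i - z_t, -kappa)."""
    best_other = max(z for i, z in enumerate(scores) if i != target)
    return max(best_other - scores[target], -kappa)


def iou_weighted_class_loss(pred_box: Box, true_box: Box,
                            scores: Sequence[float], target: int,
                            kappa: float = 0.0) -> float:
    """Hypothetical combination: scale the class margin by the box overlap."""
    return iou(pred_box, true_box) * cw_margin(scores, target, kappa)


def horizontal_motion_blur(image: List[List[float]], length: int) -> List[List[float]]:
    """Average each pixel with the `length - 1` pixels to its left.

    Indices that fall off the left edge are clamped to column 0, which
    replicates the edge pixel. This is only one simple blur model.
    """
    blurred = []
    for row in image:
        blurred.append([
            sum(row[max(0, x - j)] for j in range(length)) / length
            for x in range(len(row))
        ])
    return blurred
```

In an EOT-style loop, a blur of this kind would be one of the random transformations applied to the candidate image before it is passed to the detector, and the weighted class loss would be evaluated on the boxes predicted in the grid cell that contains the center of the FPAE.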
