    Improving Lightweight Model Robustness via Adversarial Ranking Distillation

    Abstract: With the widespread deployment of lightweight models in resource-constrained scenarios such as edge computing, their lack of adversarial robustness has become an increasingly prominent concern. Adversarial distillation is a primary means of improving the robustness of lightweight models, but existing methods predominantly impose rigid Kullback-Leibler (KL) divergence constraints for robust knowledge transfer and therefore suffer from incomplete knowledge modeling and limited transfer efficiency. To address these issues, this paper proposes an Adversarial Ranking Distillation (ARD) method that markedly strengthens the resistance of lightweight models to adversarial attacks through a priority ranking constraint mechanism and a multi-level ranking consistency framework. Specifically, the priority ranking constraint mechanism orders the output elements that a vision model produces on adversarial samples by importance, enforces consistency between the teacher's and the student's element priorities, and realizes a differentiable, loosely coupled constraint by approximating the discontinuous ranking operation with hyperbolic tangent functions. Building on this, the multi-level ranking consistency distillation framework models and transfers robustness knowledge from three perspectives: categorical semantic correlation, sample semantic correlation, and adversarial discrepancy correlation, enabling multi-perspective transfer of the teacher's adversarial defense capability while jointly improving the clean-sample accuracy and adversarial robustness of lightweight models. Extensive experiments on the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets validate the effectiveness of the method and demonstrate substantial performance advantages over state-of-the-art adversarial distillation approaches. Furthermore, ARD exhibits stronger adaptability and generalization under limited training data and black-box attacks, providing an efficient security solution for deploying lightweight models at the edge.
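    The abstract names a tanh-based differentiable relaxation of the ranking operation and a three-level consistency loss but gives no formulas. The PyTorch sketch below illustrates one plausible reading of those two ideas; every function name, the temperature tau, the rank normalization, and the equal loss weighting are illustrative assumptions, not the paper's actual definitions.

# Minimal sketch of tanh-based soft ranking and a multi-level
# ranking-consistency loss, assuming standard PyTorch. All names and
# design choices here are assumptions made for illustration.
import torch
import torch.nn.functional as F


def soft_rank(scores: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Differentiable approximation of descending rank positions.

    For each element z_i of a row, smoothly counts how many elements
    exceed it: the hard comparator 1[z_j > z_i] is replaced by
    0.5 * (1 + tanh((z_j - z_i) / tau)), so gradients can flow
    through the otherwise discontinuous ranking operation.
    """
    # diff[b, i, j] = z_j - z_i for every pair of elements in a row
    diff = scores.unsqueeze(1) - scores.unsqueeze(2)
    return 0.5 * (1.0 + torch.tanh(diff / tau)).sum(dim=2)


def rank_consistency(student: torch.Tensor, teacher: torch.Tensor,
                     tau: float = 1.0) -> torch.Tensor:
    """Penalize disagreement between teacher and student soft ranks."""
    n = student.size(1)  # normalize ranks to [0, 1] for scale invariance
    return F.mse_loss(soft_rank(student, tau) / n,
                      soft_rank(teacher.detach(), tau) / n)


def multi_level_ranking_loss(s_adv, t_adv, s_clean, t_clean, tau=1.0):
    """One plausible reading of the three consistency levels."""
    # Categorical: rank the class scores of each adversarial sample.
    l_cat = rank_consistency(s_adv, t_adv, tau)
    # Sample: transpose so each class's scores are ranked across the batch.
    l_smp = rank_consistency(s_adv.t(), t_adv.t(), tau)
    # Adversarial discrepancy: rank the per-class shift between the
    # adversarial and clean outputs.
    l_dis = rank_consistency(s_adv - s_clean, t_adv - t_clean, tau)
    return l_cat + l_smp + l_dis  # equal weighting is an assumption

    In training, s_adv and t_adv would be student and teacher logits on adversarial examples and s_clean, t_clean their logits on the corresponding clean inputs; how this loss is weighted against a standard distillation or cross-entropy term is not specified in the abstract.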
