

    Cross-Label Assignment Knowledge Distillation for End-to-End Object Detection


Abstract: Recently, the one-to-one label assignment rule has played a major role in removing non-maximum suppression (NMS) from the post-processing step and building an NMS-free, end-to-end detection paradigm. However, this strict sample-matching rule significantly reduces the number of positive samples during training, making it difficult for the model to fully mine the latent semantic information in the data during feature representation learning and lowering learning efficiency. To this end, this paper proposes a cross-label assignment knowledge distillation (CAKD) method for end-to-end object detection. The method combines the transfer-learning mechanism of knowledge distillation with end-to-end object detection to compensate for the shortcomings of the existing one-to-one label assignment rule, establishing an effective knowledge transfer path for applying the knowledge distillation framework to the training of end-to-end detection models. Specifically, the multi-scale features of the student model are first fed into the detection head of the teacher model. The distillation loss is then computed between these student-teacher cross-label-assignment predictions and the teacher's own predictions. This design avoids the feature confusion and semantic misalignment caused by directly forcing the student to imitate the teacher's features. In addition, we design an effective task-aware matching metric that jointly considers classification and regression quality, avoiding the lack of correlation between the classification and localization tasks. Extensive experiments on the COCO dataset demonstrate the effectiveness of the proposed method. Applied to the non-end-to-end FCOS baseline, the proposed cross-label assignment knowledge distillation method reaches 38.8% mean average precision (mAP) without NMS, a 2.1% accuracy gain over the original baseline's performance with NMS.
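As a rough illustration of the task-aware matching idea described in the abstract, the sketch below fuses classification confidence and localization quality (IoU) into a single matching score and then performs a greedy one-to-one assignment, so each ground-truth box keeps at most one prediction and NMS is unnecessary. The metric form `score**alpha * iou**beta` and the weights are assumptions in the style of common one-to-one matching work, not the paper's exact formulation.

```python
def task_aware_metric(cls_score, iou, alpha=0.35, beta=0.65):
    """Hypothetical task-aware matching quality: combines classification
    confidence and localization quality (IoU) into one score, so a
    prediction must be good at BOTH tasks to win the match. The exact
    form and weights used in the paper may differ."""
    return (cls_score ** alpha) * (iou ** beta)


def one_to_one_assign(cls_scores, ious):
    """Greedy one-to-one label assignment: each ground-truth box (row)
    is matched to at most one prediction (column) and vice versa, which
    is what lets an end-to-end detector skip NMS at inference time.
    Returns, for each ground truth, the index of its matched prediction
    (-1 if unmatched)."""
    num_gt, num_pred = len(cls_scores), len(cls_scores[0])
    quality = [[task_aware_metric(cls_scores[g][p], ious[g][p])
                for p in range(num_pred)] for g in range(num_gt)]
    assign = [-1] * num_gt
    used = set()
    for _ in range(num_gt):
        best_q, best_g, best_p = 0.0, None, None
        for g in range(num_gt):
            if assign[g] != -1:
                continue  # this ground truth is already matched
            for p in range(num_pred):
                if p in used or quality[g][p] <= best_q:
                    continue  # prediction consumed, or not the best pair
                best_q, best_g, best_p = quality[g][p], g, p
        if best_g is None:  # no positive-quality pair left
            break
        assign[best_g] = best_p
        used.add(best_p)
    return assign


# Toy example: 2 ground truths, 4 candidate predictions.
cls = [[0.9, 0.2, 0.8, 0.1],
       [0.1, 0.7, 0.3, 0.6]]
iou = [[0.8, 0.1, 0.9, 0.0],
       [0.0, 0.6, 0.2, 0.7]]
print(one_to_one_assign(cls, iou))  # [2, 3]: GT 0 -> pred 2, GT 1 -> pred 3
```

Note that for the first ground truth the metric prefers prediction 2 (IoU 0.9, score 0.8) over prediction 0 (IoU 0.8, score 0.9), because with beta > alpha the combined metric weights localization quality more heavily, illustrating how the joint metric keeps classification and localization correlated.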

       
