Fine-grained image recognition is characterized by large intra-class variation and small inter-class variation, and has wide applications in intelligent retail, biodiversity protection, and intelligent transportation. Extracting discriminative multi-granularity features is key to improving recognition accuracy. Most existing methods acquire knowledge at only a single level, ignoring the benefit of multi-level information interaction for extracting robust features; other works introduce attention mechanisms to locate discriminative local regions, which inevitably increases network complexity. To address these issues, this paper proposes MKSMT (multi-level knowledge self-distillation with multi-step training), a model for fine-grained image recognition. The model first learns features in a shallow network, then performs feature learning in a deep network, and uses self-distillation to transfer knowledge from the deep network to the shallow network. The optimized shallow network in turn helps the deep network extract more robust features, improving the performance of the whole model. Experimental results show that MKSMT achieves classification accuracies of 92.8%, 92.6%, and 91.1% on three publicly available fine-grained image datasets, respectively, outperforming most state-of-the-art fine-grained recognition algorithms.
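The abstract states that knowledge is transferred from the deep network to the shallow network via self-distillation, but does not specify the loss. A common choice in knowledge distillation is a temperature-softened KL divergence between the teacher (deep) and student (shallow) output distributions; the minimal sketch below illustrates that idea under this assumption. The function names, the temperature value, and the use of Hinton-style KL distillation are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def self_distillation_loss(deep_logits, shallow_logits, temperature=4.0):
    """Illustrative self-distillation loss (an assumption, not MKSMT's exact loss):
    KL divergence from the deep-network (teacher) distribution to the
    shallow-network (student) distribution. The T^2 factor is the standard
    gradient rescaling used in temperature-based distillation."""
    teacher = softmax(deep_logits, temperature)
    student = softmax(shallow_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student))
    return temperature ** 2 * kl

# Identical logits give zero loss; diverging logits give a positive loss.
print(round(self_distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]), 6))  # → 0.0
print(self_distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0]) > 0)        # → True
```

In this reading, the shallow network is optimized against the softened predictions of the deep network during the multi-step training schedule, which matches the abstract's claim that the optimized shallow network then helps the deep network learn more robust features.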