DiffAD: Difference Convolution Attention-Based Multi-Class Unsupervised Anomaly Detection
Abstract
A novel unsupervised multi-class anomaly detection model, termed DiffAD, is proposed to address two persistent challenges in visual anomaly detection for complex industrial scenarios: the scarcity of annotations and insufficient detection accuracy. The model adopts a progressive feature reconstruction strategy whose core component is a dedicated feature reconstruction module, DADE. DADE decomposes feature reconstruction into three sequential phases, namely chaos, pre-refinement, and strong refinement, which jointly improve the quality of reconstructed features and stabilize the reconstruction process. A distinguishing property of DADE is its integration of difference convolution attention with a detail enhancement mechanism: it combines difference convolution with multi-head self-attention and incorporates residual dense connections, strengthening the model's ability to capture the subtle image variations and high-frequency details that are critical for anomaly identification, and improving the precision of pixel-level anomaly localization. Extensive experiments on four representative benchmark datasets, MVTec-AD, VisA, MVTec-3D, and Uni-Medical, show that DiffAD consistently outperforms state-of-the-art models on both image-level and pixel-level anomaly detection metrics, demonstrating its practical value for unsupervised visual anomaly detection.
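To make the difference convolution attention described above concrete, the following is a minimal PyTorch sketch of such a block. It assumes the central-difference form of difference convolution (one common variant, which blends a vanilla 3x3 response with a central-difference response via a factor theta) followed by multi-head self-attention with residual connections. The module names CentralDiffConv2d and DiffConvAttention are illustrative only, and the actual DADE design, including its three-phase schedule and residual dense connections, may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CentralDiffConv2d(nn.Module):
    """3x3 convolution augmented with a central-difference term; theta
    blends the vanilla response with the difference response."""

    def __init__(self, channels: int, theta: float = 0.7):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.theta = theta

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(x)
        # Convolving the differences (x[p + p_n] - x[p]) is equivalent to
        # subtracting the kernel-sum response at the centre pixel.
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        return out - self.theta * F.conv2d(x, kernel_sum)


class DiffConvAttention(nn.Module):
    """Difference convolution feeding multi-head self-attention, with a
    residual connection around each stage (a plausible reading of the
    abstract, not the paper's exact module)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.cdc = CentralDiffConv2d(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x = x + self.cdc(x)                    # high-frequency detail branch
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C) for attention
        t = self.norm(tokens)
        attn_out, _ = self.attn(t, t, t)
        tokens = tokens + attn_out             # residual self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    feats = torch.randn(2, 64, 16, 16)         # e.g. a backbone feature map
    print(DiffConvAttention(64)(feats).shape)  # torch.Size([2, 64, 16, 16])
```

In this reading, the difference branch emphasizes local high-frequency structure before self-attention aggregates global context, which corresponds to the abstract's claim that the module captures subtle variations and fine details relevant to anomaly localization.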