Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection

Abstract

For anomaly detection (AD), early approaches often train separate models for individual classes, yielding high performance but posing challenges in scalability and resource management. Recent efforts have shifted toward training a single model capable of handling multiple classes. However, directly extending early AD methods to multi-class settings often results in degraded performance.

In this paper, we investigate this performance degradation observed in reconstruction-based methods, identifying the key issue: inter-class confusion. This confusion emerges when a model trained in multi-class scenarios incorrectly reconstructs samples from one class as another, thereby exacerbating reconstruction errors. To this end, we propose a simple yet effective modification, called class-aware contrastive learning (CCL). By explicitly leveraging raw object category information (e.g., carpet or wood) as supervised signals, we introduce local CL to refine multiscale dense features, and global CL to obtain more compact feature representations of normal patterns, thereby effectively adapting the models to multi-class settings. Experiments across five datasets validate the effectiveness of our approach, demonstrating significant improvements and superior performance compared to state-of-the-art methods. Notably, ablation studies indicate that pseudo-class labels can achieve comparable performance.
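To make the idea of using class labels as a contrastive signal concrete, here is a minimal, dependency-free sketch of a generic supervised contrastive loss. It is not the paper's implementation: the function name `class_aware_contrastive_loss`, the `temperature` value, and the toy feature vectors are assumptions made for illustration. The same form of loss could in principle be applied to pooled image-level vectors (a global variant) or to per-location dense features (a local variant), but how CCL actually combines the two is not specified on this page.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (guarding against a zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def class_aware_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's code).

    Features that share a class label are treated as positives and pulled
    together; features from other classes act as negatives.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without a same-class partner contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all other samples.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability of picking a positive.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy 2-D features: two "carpet"-like vectors and two "wood"-like vectors.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_compact = class_aware_contrastive_loss(
    toy_features, ["carpet", "carpet", "wood", "wood"]
)
loss_confused = class_aware_contrastive_loss(
    toy_features, ["carpet", "wood", "carpet", "wood"]
)
```

In this toy setup, `loss_compact` is small because features of the same class already cluster together, while `loss_confused` is large because each feature sits closer to the other class than to its own. That gap is the signal a contrastive objective of this kind would use to separate classes.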

(This is version 2 of the paper, which adds experiments and analysis to the first version, V1: "Revitalizing Reconstruction Models for Multi-class Anomaly Detection via Class-Aware Contrastive Learning".)

We focus on the question: "Why do one-for-one models degrade when trained on multiple classes?" Previous AD methods such as RD, DeSTSeg, SimpleNet, and DRAEM perform well, yet they train a separate model for each category. We refer to models trained this way as one-for-one models; this strategy is hard to scale because of its computational cost and model-management overhead. However, when these one-for-one models are trained directly on multiple classes, their performance decreases significantly.

We found two issues when training one-for-one models on multiple classes:

Catastrophic forgetting: The model struggles to retain previously learned knowledge as new classes are introduced.
Inter-class confusion: The model incorrectly reconstructs an input image of one class as another, for example 'carpet' as 'wood' or 'transistor' as 'tile'. It struggles to maintain accurate texture styles, particularly when anomalies exhibit stylistic similarities to other classes.

BibTeX


        # We will update the BibTeX citation once the camera-ready version is available.
        @article{fan2024revitalizing,
          title={Revitalizing Reconstruction Models for Multi-class Anomaly Detection via Class-Aware Contrastive Learning},
          author={Fan, Lei and Huang, Junjie and Di, Donglin and Su, Anyang and Pagnucco, Maurice and Song, Yang},
          journal={arXiv preprint arXiv:2412.04769},
          year={2024}
        }


Overview of LGC. Building on existing reconstruction-based models, we employ a projector layer following the encoder and apply both local CL and global CL around the neck to capture more compact feature representations for each class.

Visualization of different models across four datasets (MVTec, VisA, BTAD, and Real-IAD) under the all-in-all setting.

Comparison of various one-for-all models across four datasets. Domain-in-all refers to combining all classes within each dataset, while All-in-all signifies combining all classes across all datasets. Results are presented as I-AUROC / P-AUROC / PRO (%).
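The difference between these training settings can be sketched as a grouping of training samples into pools, with one model trained per pool. This is only an illustration under assumptions: the `(dataset, class_name, image_id)` tuple format, the function name `build_training_pools`, and the example image IDs are hypothetical and do not come from the paper's code.

```python
from collections import defaultdict


def build_training_pools(samples, setting):
    """Group samples into training pools; one model would be trained per pool.

    samples: iterable of (dataset, class_name, image_id) tuples
             (a hypothetical format chosen for this sketch).
    setting: "one-for-one", "domain-in-all", or "all-in-all".
    """
    pools = defaultdict(list)
    for dataset, class_name, image_id in samples:
        if setting == "one-for-one":
            key = (dataset, class_name)  # a separate model per class
        elif setting == "domain-in-all":
            key = dataset  # all classes within one dataset share a model
        elif setting == "all-in-all":
            key = "all"  # every class from every dataset shares a model
        else:
            raise ValueError(f"unknown setting: {setting}")
        pools[key].append((dataset, class_name, image_id))
    return dict(pools)


# Hypothetical samples; the image IDs are placeholders.
example_samples = [
    ("MVTec", "carpet", "img_000"),
    ("MVTec", "wood", "img_001"),
    ("VisA", "candle", "img_002"),
]
one_for_one = build_training_pools(example_samples, "one-for-one")
domain_in_all = build_training_pools(example_samples, "domain-in-all")
all_in_all = build_training_pools(example_samples, "all-in-all")
```

With these three samples, the one-for-one setting yields three pools, domain-in-all yields two (one per dataset), and all-in-all yields a single pool containing everything.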
