Abstract:
Anomaly detection aims to identify data that deviate from expected behavior patterns. Semi-supervised anomaly detection methods can improve detection accuracy by using a small amount of labeled data as prior knowledge, but the labeled anomalies (i.e., seen anomalies) are unlikely to cover all anomaly types. In real-world scenarios, novel types of anomalies (i.e., unseen anomalies) often emerge, and because their characteristics may differ from those of the seen anomalies, they are difficult to detect with existing semi-supervised anomaly detection methods. To address this issue, we propose a semi-supervised unknown anomaly detection (SSUAD) method that identifies seen and unseen anomalies simultaneously. The method uses a closed-set classifier to separate seen anomalies from normal instances and an unknown anomaly detector to detect unseen anomalies. Moreover, because anomalies are extremely scarce relative to normal instances in anomaly detection scenarios, we design a data augmentation strategy to increase the number of anomaly samples. Experiments are conducted on the UNSW-NB15 and KDDCUP99 datasets and on a real-world dataset, SQB. The results show that, compared with existing anomaly detection methods, SSUAD achieves significant improvements in the performance metrics AUC-ROC and AUC-PR, which supports the effectiveness and soundness of the proposed method.
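
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how AUC-ROC and AUC-PR can be computed from anomaly scores. It is not the evaluation code used in this work: the function names `auc_roc` and `auc_pr`, the toy labels and scores, and the choice of average precision as the estimate of AUC-PR are all assumptions made for illustration.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen anomaly (label 1)
    receives a higher score than a randomly chosen normal instance (label 0).
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_pr(labels, scores):
    """AUC-PR estimated as average precision (an assumption of this sketch):
    the sum over score thresholds of (recall gain) * (precision at threshold)."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one anomaly")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    ap, tp, prev_recall, i = 0.0, 0, 0.0, 0
    while i < len(ranked):
        j = i
        # Consume every instance that shares the current score (one threshold).
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            j += 1
        recall = tp / total_pos
        precision = tp / j  # j instances have been flagged so far
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy data (hypothetical): 2 anomalies among 8 instances.
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.2, 0.1, 0.05]
print(auc_roc(labels, scores), auc_pr(labels, scores))
```

AUC-PR is commonly reported alongside AUC-ROC when anomalies are rare, because precision depends directly on how many normal instances are ranked above the anomalies, whereas AUC-ROC can remain high even when false positives outnumber true positives.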