Data Contamination Defense Method for Intelligent Network Intrusion Detection Systems Based on Edge Examples

刘广睿; 张伟哲; 李欣洁

doi:10.7544/issn1000-1239.20220509

Abstract: Artificial intelligence has been widely used in network intrusion detection systems. Because traffic samples exhibit concept drift, the models used for malicious traffic identification must be updated frequently to adapt to new feature distributions. The effectiveness of an updated model depends on the quality of the new training samples, so preventing data contamination is essential. However, contamination filtering of traffic samples still relies on expert experience, which leads to problems such as a heavy sample-screening workload, unstable model accuracy, and vulnerability to poisoning attacks during model updates. Existing work cannot achieve contamination filtering or model repair while maintaining model performance. To solve these problems, we design a general model update method for intelligent network intrusion detection systems that supports contamination filtering. We first design the EdgeGAN algorithm, which uses fuzzing to make a generative adversarial network quickly fit the distribution of the model's edge examples. A subset of contaminated examples is then identified by checking the MSE values of the new training samples with respect to the original model and the F-β score of the updated model on the old edge examples. Letting the model learn malicious edge examples suppresses the influence of poisoned examples and ensures that the model recovers quickly after poisoning. Finally, experiments on 5 typical intelligent network intrusion detection systems verify the effectiveness of the update method for contamination filtering and model repair. Compared with state-of-the-art methods, the new method improves the detection rate of poisoned examples by 12.50% on average and the repair effect on poisoned models by 6.38% on average. The method can protect the update process of any common intelligent network intrusion detection system, reducing manual sample screening, lowering the cost of poisoning detection and model repair, and safeguarding model performance and robustness. It can also be used to protect other similar intelligent threat detection models.
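The two checks named in the abstract (an MSE check of new samples against the original model, and an F-β check of the updated model on old edge examples) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `batch_mse`, `f_beta`, and `update_is_suspicious`, the binary label convention (1 = malicious, 0 = benign), and both thresholds are assumptions made only for illustration. The F-β formula itself is the standard one, F_β = (1 + β²)·TP / ((1 + β²)·TP + β²·FN + FP).

```python
from typing import Callable, Sequence


def batch_mse(score_fn: Callable[[float], float],
              samples: Sequence[float],
              labels: Sequence[int]) -> float:
    """Mean squared error between the ORIGINAL model's scores and the
    labels attached to a batch of new training samples (hypothetical helper)."""
    errors = [(score_fn(x) - y) ** 2 for x, y in zip(samples, labels)]
    return sum(errors) / len(errors)


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 1.0) -> float:
    """Standard F-beta score for binary labels (1 = malicious, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def update_is_suspicious(mse: float, edge_score: float,
                         mse_threshold: float, score_threshold: float) -> bool:
    """Flag a batch when it disagrees strongly with the original model OR the
    updated model degrades on old edge examples (assumed decision rule)."""
    return mse > mse_threshold or edge_score < score_threshold


if __name__ == "__main__":
    # Toy "original model": the score is the feature value itself.
    mse = batch_mse(lambda x: x, [0.9, 0.2], [1, 0])
    # Toy predictions of an "updated model" on four old edge examples.
    score = f_beta([1, 1, 0, 0], [1, 0, 1, 0], beta=1.0)
    print(mse, score, update_is_suspicious(mse, score, 0.1, 0.8))
```

In this sketch a batch is flagged if either check fails; how the paper actually combines the two signals, and how it picks β and the thresholds, is not stated in the abstract.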