An Intrusion Detection Scheme Based on Semi-Supervised Learning and Information Gain Ratio

Xu Mengfan; Li Xinghua; Liu Hai; Zhong Cheng; Ma Jianfeng

doi:10.7544/issn1000-1239.2017.20170456

Abstract

Abstract: Existing intrusion detection schemes for unknown attacks use machine learning to identify anomalous features in network traffic data. However, they select features only qualitatively, lack sufficient training data, and must cope with attacks that change dynamically, so their detection accuracy is low. To address this issue, an intrusion detection scheme based on semi-supervised learning and information gain ratio is proposed. The scheme exploits the fact that a target network under attack responds differently across its important low-level network traffic features. In the training phase, to overcome the limited size of the training set, a semi-supervised learning algorithm uses a small amount of labelled data to obtain a large-scale training set. In the detection phase, the information gain ratio is introduced to quantify the impact of each feature on detection performance, and weighted voting infers the final output label. This retains as much feature information as possible and lets the weight of each individual decision tree adapt to dynamic attacks. Experiments show that the scheme can quantitatively analyze the important network traffic features of unknown attacks in a target network and detect them using a small amount of labelled data, with an accuracy of at least 91% and a false negative rate of at most 5% across different target networks, a clear advantage over existing schemes.
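The abstract does not give the formula behind the information gain ratio, so the sketch below uses the standard definition from C4.5 decision trees: information gain divided by the entropy of the feature's own value distribution (the split information). It is an illustration only, not the authors' implementation. The function names and the toy feature values are hypothetical, and features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of one discrete feature with respect to the labels.

    Standard C4.5 definition (an assumption here, not taken from the paper):
    gain ratio = information gain / split information.
    """
    n = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four flows, two traffic features.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["udp", "udp", "tcp", "tcp"]   # perfectly separates the labels
flag = ["a", "b", "a", "b"]               # independent of the labels

ratio_protocol = gain_ratio(protocol, labels)
ratio_flag = gain_ratio(flag, labels)
```

In this toy data `protocol` receives the highest possible ratio and `flag` the lowest, which is the kind of quantitative ranking a scheme could use to weight features or the trees built on them.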
