    Xu Mengfan, Li Xinghua, Liu Hai, Zhong Cheng, Ma Jianfeng. An Intrusion Detection Scheme Based on Semi-Supervised Learning and Information Gain Ratio[J]. Journal of Computer Research and Development, 2017, 54(10): 2255-2267. DOI: 10.7544/issn1000-1239.2017.20170456

    An Intrusion Detection Scheme Based on Semi-Supervised Learning and Information Gain Ratio

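    • Abstract (translated from the Chinese original): Existing methods for detecting unknown attacks select features only qualitatively, which leads to low detection accuracy. To address this problem, an intrusion detection scheme based on semi-supervised learning and information gain ratio is proposed. The scheme exploits the fact that an attack on a target network manifests differently across the important low-level network traffic features. In the model training phase, to overcome the limited size of the training set, a semi-supervised learning algorithm uses a small amount of labeled data to obtain a large-scale training set. In the model detection phase, the information gain ratio is introduced to quantify how strongly each feature affects detection performance, which retains as much feature information as possible and improves the detection of unknown attacks. Experimental results show that the scheme can use a small amount of labeled data to quantitatively analyze the important network traffic features of unknown attacks in a target network and detect them, and that its detection accuracy for unknown attacks exceeds 90% on each of the target networks tested.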

       

      Abstract: State-of-the-art intrusion detection schemes for unknown attacks employ machine learning techniques to identify anomalous features within network traffic data. However, owing to insufficient training data, the difficulty of selecting features quantitatively, and the dynamic nature of unknown attacks, existing schemes cannot detect unknown attacks effectively. To address this issue, an intrusion detection scheme based on semi-supervised learning and information gain ratio is proposed. To overcome the limited size of the training set in the training phase, a semi-supervised learning algorithm is used to obtain a large-scale training set from a small amount of labelled data. In the detection phase, the information gain ratio is introduced to determine the impact of different features, and weighted voting is used to infer the final output label, so that unknown attacks are identified adaptively and quantitatively. This not only retains as much feature information as possible, but also adjusts the weight of each decision tree adaptively against dynamic attacks. Extensive experiments indicate that the proposed scheme can quantitatively analyze the important network traffic features of unknown attacks and detect them using a small amount of labelled data, with no less than 91% accuracy and no more than 5% false negative rate, which gives it clear advantages over existing schemes.
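
      The abstract does not spell out how the information gain ratio is computed, so the following is only a minimal sketch of the standard definition (information gain divided by split information, as used in C4.5-style decision trees), not the authors' implementation. The function names `entropy` and `gain_ratio` and the toy feature values are hypothetical and chosen for illustration; features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of one discrete feature w.r.t. the labels.

    Standard definition: (H(labels) - H(labels | feature)) / H(feature).
    Returns 0.0 for a constant feature, whose split information is zero.
    """
    total = len(labels)

    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)

    # Conditional entropy of the labels given the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional

    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four flows, two labelled as attacks.
labels = ["attack", "attack", "normal", "normal"]
flag = ["syn", "syn", "ack", "ack"]        # separates the classes perfectly
protocol = ["tcp", "udp", "tcp", "udp"]    # carries no label information

ratio_flag = gain_ratio(flag, labels)
ratio_protocol = gain_ratio(protocol, labels)
```

      On this toy data the feature that separates the two classes scores higher than the one that is independent of the label, which is the kind of quantitative ranking of features the abstract refers to.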

       
