An Intrusion Detection Scheme Based on Semi-Supervised Learning and Information Gain Ratio

Xu Mengfan; Li Xinghua; Liu Hai; Zhong Cheng; Ma Jianfeng

doi:10.7544/issn1000-1239.2017.20170456

(School of Cyber Engineering, Xidian University, Xi'an 710071) (812455541@qq.com)

Funding: National Natural Science Foundation of China (U170820014, 61372075, U1135002, 61672408)

CLC number: TP309.7
Publication history
- Published: 2017-09-30

An Intrusion Detection Scheme Based on Semi-Supervised Learning and Information Gain Ratio

(School of Cyber Engineering, Xidian University, Xi'an 710071)

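Abstract

Abstract: Existing methods for detecting unknown attacks select features only qualitatively, which leads to low detection accuracy. To address this problem, an intrusion detection scheme based on semi-supervised learning and information gain ratio is proposed. The scheme exploits the fact that different attacks on a target network manifest differently in its important low-level network traffic features. In the model training phase, to overcome the limited size of the training dataset, a semi-supervised learning algorithm uses a small amount of labelled data to obtain a large-scale training dataset. In the model detection phase, the information gain ratio is introduced to quantify how strongly each feature affects detection performance, retaining as much feature information as possible and thereby improving the detection of unknown attacks. Experimental results show that the scheme can use a small amount of labelled data to quantitatively analyze the important network traffic features of unknown attacks in a target network and detect them, with a detection accuracy of 90% or above for unknown attacks in every target network tested.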
- intrusion detection
- unknown attacks
- feature selection
- semi-supervised learning
- information gain ratio
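
As a rough illustration of how an information gain ratio can rank traffic features, the following minimal sketch uses the standard C4.5-style definition (information gain divided by split information) on discrete feature values. The function names and the toy flow records are hypothetical and are not taken from the paper, which does not give its exact formulation in this abstract.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    total = len(labels)
    # Group the class labels by the feature value they occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalises features with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy flows: one label and two candidate features per flow.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["tcp", "tcp", "udp", "udp"]   # separates the classes
flag = ["a", "b", "a", "b"]               # unrelated to the classes

ranking = sorted(
    {"protocol": gain_ratio(protocol, labels),
     "flag": gain_ratio(flag, labels)}.items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

On this toy data the `protocol` feature separates the two classes perfectly and receives a ratio of 1, while `flag` carries no information about the label and receives 0. Continuous traffic features would need to be discretised first, which this sketch leaves out.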
Abstract: State-of-the-art intrusion detection schemes for unknown attacks use machine learning techniques to identify anomalous features in network traffic data. However, owing to insufficient training data, the difficulty of selecting features quantitatively, and the dynamic nature of unknown attacks, existing schemes cannot detect unknown attacks effectively. To address this issue, an intrusion detection scheme based on semi-supervised learning and information gain ratio is proposed. In the training phase, to overcome the limited size of the training set, a semi-supervised learning algorithm is used to obtain a large-scale training set from a small amount of labelled data. In the detection phase, the information gain ratio is introduced to determine the impact of different features, and weighted voting is used to infer the final output label, so that unknown attacks are identified adaptively and quantitatively. This not only retains as much feature information as possible but also adjusts the weight of each decision tree adaptively against dynamic attacks. Extensive experiments indicate that the proposed scheme can quantitatively analyze the important network traffic features of unknown attacks and detect them using a small amount of labelled data, with an accuracy of no less than 91% and a false negative rate of no more than 5%, a clear advantage over existing schemes.
- intrusion detection
- unknown attacks
- feature selection
- semi-supervised learning
- information gain ratio
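
The idea of growing a training set from a small amount of labelled data can be sketched with self-training (pseudo-labelling). This abstract does not say which semi-supervised algorithm the paper uses, so the nearest-centroid classifier, the `margin` confidence rule, and all names below are assumptions chosen only to keep the example short and dependency-free.

```python
import math


def centroids(samples, labels):
    """Mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def self_train(labelled, labels, unlabelled, margin=0.5, max_rounds=10):
    """Pseudo-label unlabelled samples that lie clearly closer to one class centroid.

    A sample is accepted when its distance to the nearest centroid is at most
    `margin` times its distance to the second-nearest one (hypothetical rule).
    Returns the enlarged sample list, its labels, and the still-unlabelled pool.
    """
    samples, targets = list(labelled), list(labels)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        cents = centroids(samples, targets)
        remaining, added = [], False
        for x in pool:
            ranked = sorted((math.dist(x, c), y) for y, c in cents.items())
            confident = len(ranked) == 1 or ranked[0][0] <= margin * ranked[1][0]
            if confident:
                samples.append(x)
                targets.append(ranked[0][1])
                added = True
            else:
                remaining.append(x)
        pool = remaining
        # Stop when a full pass adds nothing or nothing is left to label.
        if not added or not pool:
            break
    return samples, targets, pool


# Hypothetical two-feature flows: two labelled seeds and three unlabelled flows.
seed = [(0.0, 0.0), (10.0, 10.0)]
seed_labels = ["normal", "attack"]
unknown = [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)]

grown, grown_labels, leftover = self_train(seed, seed_labels, unknown)
print(len(grown), grown_labels, leftover)
```

In this toy run the two flows near the seeds are absorbed into the training set, while the ambiguous midpoint stays unlabelled rather than being given a low-confidence label. A real scheme would replace the centroid rule with its own base classifier and confidence estimate.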
