ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2019, Vol. 56 ›› Issue (3): 566-575. doi: 10.7544/issn1000-1239.2019.20180063

• Information Security •

基于KNN离群点检测和随机森林的多层入侵检测方法

任家东1,2,刘新倩1,2,王倩1,2,何海涛1,2,赵小林3,4   

  1(燕山大学信息科学与工程学院 河北秦皇岛 066001); 2(河北省软件工程重点实验室(燕山大学) 河北秦皇岛 066001); 3(北京理工大学软件学院 北京 100081); 4(软件安全工程技术北京市重点实验室(北京理工大学) 北京 100081) (jdren@ysu.edu.cn)
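  • Published: 2019-03-01
  • Funding: 
    National Key Research and Development Program of China (2016YFB0800700); National Natural Science Foundation of China (61472341, 61772449, 61572420); Natural Science Foundation of Hebei Province (F2016203330, F2015203326); Yanshan University Postdoctoral Research Merit-Based Funding Project (B2017003005); Yanshan University Doctoral Foundation (B1036)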

A Multi-Level Intrusion Detection Method Based on KNN Outlier Detection and Random Forests

Ren Jiadong1,2, Liu Xinqian1,2, Wang Qian1,2, He Haitao1,2, Zhao Xiaolin3,4   

  1(School of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066001); 2(Hebei Key Laboratory of Software Engineering (Yanshan University), Qinhuangdao, Hebei 066001); 3(School of Software, Beijing Institute of Technology, Beijing 100081); 4(Beijing Key Laboratory of Software Security Engineering Technology (Beijing Institute of Technology), Beijing 100081)
  • Online: 2019-03-01

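Abstract: Intrusion detection systems can effectively detect anomalous attack behavior in a network and are vital to network security. At present, many intrusion detection methods achieve relatively low detection rates for the attack types Probe (probing), U2R (user to root), and R2L (remote to local). To address this problem, a new hybrid multi-level intrusion detection model is proposed to detect normal and abnormal network behavior. The model first applies the KNN (K nearest neighbors) outlier detection algorithm to detect and delete outlying data, yielding a small, high-quality training dataset. Next, drawing on the similarity of network traffic, a category detection division method is proposed; it avoids mutual interference among abnormal behaviors during detection, especially when detecting low-traffic attacks. Combined with this division method, a multi-level random forest model is built to detect abnormal network behavior, improving the detection of network attacks. The popular KDD (knowledge discovery and data mining) Cup 1999 dataset is used to evaluate the proposed model. Compared with other algorithms, the method achieves markedly better accuracy and detection rate, and it effectively detects the three attack types Probe, U2R, and R2L.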
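The outlier-removal step can be illustrated with a minimal sketch. The abstract does not specify the scoring rule or the cut-off, so everything here is an assumption: each sample is scored by its mean Euclidean distance to its `k` nearest neighbours within the same category, and the most isolated `drop_fraction` of each category is deleted. The function names and both parameters are hypothetical.

```python
import math
from collections import defaultdict


def knn_outlier_scores(points, k):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest) if nearest else 0.0)
    return scores


def remove_outliers_per_class(samples, labels, k=3, drop_fraction=0.1):
    """Delete the most isolated fraction of samples in each category.

    The per-category scoring and the drop_fraction cut-off are assumptions
    made for illustration; the abstract does not state the paper's rule.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    keep = []
    for indices in by_class.values():
        scores = knn_outlier_scores([samples[i] for i in indices], k)
        n_drop = int(len(indices) * drop_fraction)
        # Rank the category's members from most to least isolated.
        ranked = sorted(zip(scores, indices), reverse=True)
        dropped = {idx for _, idx in ranked[:n_drop]}
        keep.extend(i for i in indices if i not in dropped)

    keep.sort()  # preserve the original sample order
    return [samples[i] for i in keep], [labels[i] for i in keep]
```

Calling `remove_outliers_per_class(samples, labels)` returns the filtered samples and their labels in the original order. The pairwise distance computation is quadratic in the category size, which is acceptable for a sketch but would need an index structure on a dataset the size of KDD Cup 1999.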

Keywords: network security, intrusion detection system, KNN outlier detection, random forest model, multi-level

Abstract: Intrusion detection systems can efficiently detect attack behaviors that do great damage to network security. Currently, many intrusion detection systems have low detection rates for the abnormal behaviors Probe (probing), U2R (user to root), and R2L (remote to local). To address this weakness, a new hybrid multi-level intrusion detection method is proposed to classify network data as normal or abnormal behavior. The method, called KNN-RF, combines a KNN (K nearest neighbors) outlier detection algorithm with a multi-level random forests (RF) model. First, the KNN outlier detection algorithm is applied to detect and delete outliers in each category, yielding a small, high-quality training dataset. Then, based on the similarity of network traffic, a new method for dividing the data categories is put forward; this division avoids mutual interference among anomalous behaviors during detection, especially when detecting low-traffic attack behaviors. Based on this division, a multi-level random forests model is constructed to detect abnormal network behaviors and improve the efficiency of detecting known and unknown attacks. The popular KDD (knowledge discovery and data mining) Cup 1999 dataset is used to evaluate the performance of the proposed method. Compared with other algorithms, the proposed method is significantly superior in accuracy and detection rate, and it detects Probe, U2R, and R2L effectively.
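The multi-level idea can be sketched as a cascade in which each level answers one yes/no question and passes the remaining traffic on, so that classes separated early cannot interfere with later decisions. This is only an illustration under stated assumptions: the abstract does not give the paper's category division or level order, so `order` and `default` are hypothetical, and `NearestCentroid` is a deliberately simple stand-in used so the example runs with the standard library alone. In the method described above, each level would hold a random forest; any object exposing `fit` and `predict` could be passed as `make_model`.

```python
import math


class NearestCentroid:
    """Stand-in binary classifier; a random forest would take its place."""

    def fit(self, X, y):
        groups = {}
        for x, label in zip(X, y):
            groups.setdefault(label, []).append(x)
        # One centroid per label: the column-wise mean of its samples.
        self.centroids = {
            label: tuple(sum(col) / len(rows) for col in zip(*rows))
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: math.dist(x, self.centroids[c]))
            for x in X
        ]


class MultiLevelDetector:
    """Cascade of binary detectors: level i asks 'is this sample order[i]?'."""

    def __init__(self, order, default, make_model=NearestCentroid):
        self.order = order          # assumed level order, one class per level
        self.default = default      # label for samples no level claims
        self.make_model = make_model
        self.levels = []

    def fit(self, X, y):
        remaining = list(zip(X, y))
        self.levels = []
        for target in self.order:
            if not remaining:
                break
            feats = [x for x, _ in remaining]
            flags = [label == target for _, label in remaining]
            self.levels.append((target, self.make_model().fit(feats, flags)))
            # Later levels train only on traffic this level should pass on.
            remaining = [(x, lab) for x, lab in remaining if lab != target]
        return self

    def predict(self, X):
        out = []
        for x in X:
            label = self.default
            for target, model in self.levels:
                if model.predict([x])[0]:
                    label = target
                    break
            out.append(label)
        return out
```

A detector built as `MultiLevelDetector(order=["dos", "probe"], default="normal").fit(X, y)` first separates `dos` from everything else, then separates `probe` from what remains, and labels the rest `normal`. Training each later level on a smaller, more homogeneous subset is one way to keep rare classes such as U2R and R2L from being swamped by high-volume traffic.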

Key words: network security, intrusion detection system, KNN outlier detection, random forests model, multi-level

CLC Number: