
A Classification Approach Based on Divergence for Network Traffic in Presence of Concept Drift

Cheng Guang, Qian Dexin, Guo Jianwei, Shi Haibin, Wu Hua, Zhao Yuyu

Citation: Cheng Guang, Qian Dexin, Guo Jianwei, Shi Haibin, Wu Hua, Zhao Yuyu. A Classification Approach Based on Divergence for Network Traffic in Presence of Concept Drift[J]. Journal of Computer Research and Development, 2020, 57(12): 2673-2682. DOI: 10.7544/issn1000-1239.2020.20190691. CSTR: 32373.14.issn1000-1239.2020.20190691

  • CLC number: TP393

Funds: This work was supported by the National Key Research and Development Program of China (2018YFB1800602, 2017YFB0801703), the Ministry of Education-China Mobile Research Fund Project (MCM20180506), the National Natural Science Foundation of China (61602114), and the CERNET Innovation Project (NGIICS20190101, NGII20170406).
  • Abstract: The statistical features of network traffic and their distributions change dynamically, abruptly, and irreversibly. This gives rise to concept drift, which degrades the accuracy of machine-learning-based traffic classifiers: a model trained on the original data set performs worse on new samples. Periodically retraining the model is time-consuming and does not guarantee its generalization ability. To address this, a divergence-based classification approach for network traffic in the presence of concept drift is proposed, named ECDD (ensemble classification based on divergence detection). The method uses a double-layer window mechanism to track concept drift. From the perspective of information entropy, it uses the Jensen-Shannon divergence (JSD) of the traffic feature distributions to measure the difference between the data distributions in the old and new sliding windows, and thereby detects concept drift. Drawing on the idea of incremental ensemble learning, it trains a new classifier on the new samples when drift is detected, ranks the classifiers by weight, retains the better-performing ones in place of those whose performance has degraded, and classifies samples by a weighted combination of the retained classifiers' outputs. Traffic from common network applications was captured, and concept drift data sets were constructed according to the differences in application feature distributions. In experimental comparisons with common concept drift detection methods, the method detects concept drift and updates the classifier effectively, and shows better classification performance.
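The window comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's ECDD implementation: the equal-width histogram, the bin count, the threshold value, and the function names `histogram`, `js_divergence`, and `drift_detected` are all assumptions made for illustration, and both windows are assumed to be non-empty lists of values of a single numeric traffic feature. For discrete distributions P and Q with mixture M = (P + Q) / 2, the Jensen-Shannon divergence is JSD(P, Q) = KL(P || M) / 2 + KL(Q || M) / 2, which lies in [0, 1] when logarithms are taken base 2.

```python
import math


def histogram(values, bins, lo, hi):
    """Normalized equal-width histogram of `values` over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) of two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0
        # because b is the mixture of the two distributions.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def drift_detected(old_window, new_window, bins=10, threshold=0.1):
    """Flag drift when the JSD between the two windows exceeds `threshold`.

    `bins` and `threshold` are illustrative values, not taken from the paper.
    """
    lo = min(min(old_window), min(new_window))
    hi = max(max(old_window), max(new_window))
    if hi == lo:
        return False  # every value is identical, so the distributions match
    p = histogram(old_window, bins, lo, hi)
    q = histogram(new_window, bins, lo, hi)
    return js_divergence(p, q) > threshold
```

In a streaming setting, a check such as `drift_detected` would be run each time the new window fills, and a positive result would trigger training of a new classifier on the new window. Extending the sketch to several features, or to the double-layer window, would need details that the abstract does not give.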
Publication history
  • Published: 2020-11-30
