
G-mean Weighted Classification Method for Imbalanced Data Stream with Concept Drift

Liang Bin, Li Guanghui, Dai Chenglong

Citation: Liang Bin, Li Guanghui, Dai Chenglong. G-mean Weighted Classification Method for Imbalanced Data Stream with Concept Drift[J]. Journal of Computer Research and Development, 2022, 59(12): 2844-2857. DOI: 10.7544/issn1000-1239.20210471. CSTR: 32373.14.issn1000-1239.20210471

Funding: This work was supported by the National Natural Science Foundation of China (62072216).
  • CLC number: TP391
  • Abstract: Concept drift and class imbalance in data streams seriously degrade the performance and stability of data stream classification algorithms. To address both problems in binary data stream classification, this paper introduces an online update mechanism for component classifier weights into block-based ensemble classification and combines it with resampling and an adaptive sliding window. The result is an online G-mean weighted classification method for imbalanced data streams, termed OGUEIL (online G-mean update ensemble for imbalance learning). Built on an ensemble learning framework, OGUEIL uses a time decay factor to incrementally compute each component classifier's G-mean over the most recent instances and derives the classifier's weight from that value. Whenever a new instance arrives, all component classifiers and their weights are updated online, and minority-class instances are randomly oversampled. In addition, OGUEIL periodically constructs a class-balanced dataset from the data in the current sliding window, trains a new candidate classifier on it, and adds the candidate to the ensemble only when specific conditions are met. Experimental results on real-world and synthetic datasets show that the overall performance of the proposed method is better than that of the other comparable methods.
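The abstract does not give the update formulas, so the following is only a minimal sketch of the general idea of a time-decayed, incrementally updated G-mean used as an ensemble weight. G-mean here is the geometric mean of the recalls on the two classes. The class name `DecayedGMean`, the function `weighted_vote`, the exponential decay form, and the default `decay=0.99` are all illustrative assumptions, not the authors' OGUEIL implementation.

```python
import math


class DecayedGMean:
    """Time-decayed G-mean of one binary classifier (labels 0 and 1).

    Assumption: older instances are down-weighted exponentially; the
    paper's exact decay scheme is not stated in the abstract.
    """

    def __init__(self, decay=0.99):
        self.decay = decay
        self.correct = [0.0, 0.0]  # decayed count of correct predictions per class
        self.seen = [0.0, 0.0]     # decayed count of instances per class

    def update(self, y_true, y_pred):
        # Fade the old evidence, then add the newly arrived instance.
        for c in (0, 1):
            self.correct[c] *= self.decay
            self.seen[c] *= self.decay
        self.seen[y_true] += 1.0
        if y_pred == y_true:
            self.correct[y_true] += 1.0

    def value(self):
        # G-mean = sqrt(recall on class 0 * recall on class 1).
        # Returns 0.0 until both classes have been observed.
        if self.seen[0] == 0.0 or self.seen[1] == 0.0:
            return 0.0
        recall0 = self.correct[0] / self.seen[0]
        recall1 = self.correct[1] / self.seen[1]
        return math.sqrt(recall0 * recall1)


def weighted_vote(predictions, weights):
    """Return the label (0 or 1) with the larger total weight; ties go to 0."""
    score = [0.0, 0.0]
    for pred, weight in zip(predictions, weights):
        score[pred] += weight
    return 1 if score[1] > score[0] else 0
```

In this sketch each component classifier would own one `DecayedGMean` tracker, call `update` on every arriving labeled instance, and pass `value()` as its weight to `weighted_vote`. Setting `decay=1.0` reduces the tracker to the ordinary G-mean over all instances seen so far. Because a classifier that ignores the minority class has a recall of zero on it, its weight drops to zero regardless of its overall accuracy, which is the usual motivation for G-mean under class imbalance.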

Publication history
  • Published: 2022-11-30
