Guo Husheng, Zhang Yutong, Wang Wenjian. Elastic Gradient Ensemble for Concept Drift Adaptation[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440407

Elastic Gradient Ensemble for Concept Drift Adaptation

Funds: This work was supported by the National Natural Science Foundation of China (62276157, U21A20513, 62076154) and the Key Research & Development Program of Shanxi Province (202202020101003).
More Information
  • Author Bio:

    Guo Husheng: born in 1986. PhD, professor and PhD supervisor. Senior member of CCF. His main research interests include machine learning, data mining and computational intelligence

    Zhang Yutong: born in 1999. Master candidate. His main research interests include stream data mining and online machine learning

    Wang Wenjian: born in 1968. PhD, professor and PhD supervisor. Outstanding member of CCF. Her main research interests include machine learning, data mining and computational intelligence

  • Received Date: May 30, 2024
  • Revised Date: September 09, 2024
  • Accepted Date: October 14, 2024
  • Available Online: October 21, 2024
  • Abstract: With the surge of streaming data, concept drift has become an important and challenging problem in streaming data mining. Most existing ensemble learning methods neither explicitly identify the type of concept drift nor adopt adaptation strategies tailored to it, so model performance is uneven across drift types. To address this, we propose an elastic gradient ensemble for concept drift adaptation (EGE_CD). First, the gradient boosting residuals are extracted and the streaming residual ratio is computed to detect the drift position; the residual volatility is then computed to identify the drift type. Next, drifted learners are located through the changes in their losses, and the corresponding learners are deleted according to the drift type and the residual distribution, realizing elastic gradient pruning. Finally, incremental learning is combined with sliding sampling to optimize each learner's fitting process by computing an optimal fitting rate, and incremental gradient growth is realized according to the changes in the learners' residuals. Experimental results show that EGE_CD improves the stability and adaptability of the model across different concept drift types and achieves good generalization performance.
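The three steps the abstract outlines (residual-ratio drift detection, volatility-based drift-type identification, and loss-guided elastic pruning) can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the published EGE_CD algorithm: the function names, the volatility threshold, and the abrupt-vs-gradual pruning tolerances are invented for exposition.

```python
import statistics

def residual_ratio(window_old, window_new):
    """Ratio of the mean absolute residual in the new window to that of the
    old window. A ratio well above 1 suggests the ensemble no longer fits
    the incoming stream, i.e. a candidate drift position."""
    old = sum(abs(r) for r in window_old) / len(window_old)
    new = sum(abs(r) for r in window_new) / len(window_new)
    return new / old if old > 0 else float("inf")

def drift_type(window_new, volatility_threshold=0.5):
    """Classify the drift by residual volatility (population stdev of the
    post-drift residual window): high volatility -> 'abrupt', low ->
    'gradual'. The threshold is illustrative, not from the paper."""
    return "abrupt" if statistics.pstdev(window_new) > volatility_threshold else "gradual"

def prune_learners(learners, losses_before, losses_after, drift):
    """Elastic-pruning sketch: drop learners whose loss grew after the drift.
    Under abrupt drift prune aggressively (any increase); under gradual
    drift drop only clearly degraded learners (>20% increase)."""
    tol = 0.0 if drift == "abrupt" else 0.2
    return [lrn for lrn, b, a in zip(learners, losses_before, losses_after)
            if a <= b * (1 + tol)]

# Toy run: residuals double after the drift point, with noisy post-drift values.
ratio = residual_ratio([1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0])   # 2.0
kind = drift_type([0.0, 2.0, 0.0, 2.0])                              # "abrupt"
kept = prune_learners(["h1", "h2", "h3"],
                      [1.0, 1.0, 1.0],
                      [1.0, 1.5, 0.9], drift="gradual")              # h2 pruned
```

In the actual method the surviving ensemble is then regrown by incrementally fitting new gradient-boosting learners on the post-drift residuals; the sketch stops at the pruning step because the fitting-rate optimization depends on details not given in the abstract.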

