Elastic Gradient Ensemble for Concept Drift Adaptation

Guo Husheng; Zhang Yutong; Wang Wenjian

doi:10.7544/issn1000-1239.202440407

Abstract: With the rapid growth of streaming data, concept drift has become an important and challenging problem in streaming data mining. Most existing ensemble learning methods neither identify the type of concept drift nor adopt an efficient ensemble adaptation strategy tailored to it, so their performance varies widely across drift types. To address this, we propose an elastic gradient ensemble method for concept drift adaptation (EGE_CD). First, the method extracts the gradient boosting residuals and computes the flow residual ratio to detect the drift site, then computes the residual volatility to identify the drift type. Next, it uses changes in learner loss to extract the drifted learners and deletes the corresponding learners according to the drift type and the residual distribution characteristics, realizing elastic gradient pruning. Finally, it combines incremental learning with sliding sampling, optimizes the learner fitting process by computing the optimal fitting rate, and realizes incremental gradient growth according to the change in residuals. Experimental results show that the proposed method improves the model's stability and adaptability across different drift types and achieves good generalization performance.
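
The drift-site detection step can be illustrated with a minimal sketch. The abstract does not define the flow residual ratio, so the definition used here (mean absolute residual of the latest window divided by that of an initial reference window) is an assumption, as are the function names, window sizes, and the threshold of 2.0. It is a toy illustration of ratio-based detection on a residual stream, not the EGE_CD implementation.

```python
from statistics import mean


def residual_ratio(residuals, ref_size, win_size, t):
    # Assumed definition (not from the paper): mean |residual| of the
    # latest window ending at position t, divided by the mean |residual|
    # of an initial reference window.
    reference = [abs(r) for r in residuals[:ref_size]]
    recent = [abs(r) for r in residuals[t - win_size:t]]
    return mean(recent) / (mean(reference) + 1e-12)


def detect_drift_site(residuals, ref_size=20, win_size=10, threshold=2.0):
    # Slide the window over the stream and report the first position at
    # which the ratio exceeds the hypothetical threshold, or None.
    for t in range(ref_size + win_size, len(residuals) + 1):
        if residual_ratio(residuals, ref_size, win_size, t) > threshold:
            return t
    return None


# Toy residual streams: one stable, one with a jump at index 40.
stable = [0.1] * 60
abrupt = [0.1] * 40 + [1.0] * 20
```

On the toy stream `abrupt`, the jump at index 40 is flagged a few samples later, once enough large residuals have entered the window; on `stable`, no drift site is reported.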
