    Online Time-sharing Adaptive Ensemble Guided by Multi-type Concept Drift

    • Abstract: Concept drift is an important property of real-world streaming data and an unavoidable difficulty in data mining. In adapting to multiple types of concept drift, training is slow, so overall performance can be good while recovery after a drift is slow. To address this, this paper proposes an Online Time-sharing Adaptive Ensemble Guided by Multi-type Concept Drift (OTAE). The method computes the time offset distance between different data blocks, extracts the sequence of distance offsets, and identifies the drift type from the characteristics of that sequence. For each drift type, it uses the regret bound of the exponential gradient descent model to initialize the model weights dynamically and asynchronously, which gives continuous asynchronous weight updates. Next, using the class features of the data, it computes the intra-class and inter-class distances of the samples, extracts the mixed density of the samples from these distances, and generates a density weight matrix that provides short-term weight adjustment. Finally, the long-term weights are combined with the short-term weight matrix to form a two-stage weighted ensemble. Experimental results show that the method speeds up the model's adaptation to different drift types and achieves good prediction performance.
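    The two-stage weighting described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and is not the OTAE implementation: the function names (eg_update, combine, weighted_vote), the learning rate eta, the element-wise product used to merge the two weight sets, and the reduction of the density weight matrix to a single vector are all hypothetical.

```python
import math

def eg_update(weights, losses, eta=0.5):
    """One exponentiated-gradient step (assumed form of the long-term update):
    members with higher loss are down-weighted, then weights are renormalised."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def combine(long_term, short_term):
    """Merge long-term and short-term weights.
    Assumption: element-wise product followed by renormalisation."""
    prod = [a * b for a, b in zip(long_term, short_term)]
    total = sum(prod)
    return [p / total for p in prod]

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among ensemble members."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Three ensemble members start with equal long-term weights.
weights = [1 / 3, 1 / 3, 1 / 3]
# Member 0 was correct on the last block (loss 0); members 1 and 2 were wrong (loss 1).
weights = eg_update(weights, losses=[0.0, 1.0, 1.0])
# Hypothetical short-term (density-based) weights for the current sample.
final = combine(weights, short_term=[0.2, 0.5, 0.3])
label = weighted_vote(["a", "b", "b"], final)
```

    In this sketch the long-term weights change slowly with accumulated loss, while the short-term weights can shift the vote for an individual sample; how OTAE actually derives either set is specified in the paper, not here.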

       
