    Citation: Liu Yanfang, Li Wenbin, Gao Yang. Passive-Aggressive Learning with Feature Evolvable Streams[J]. Journal of Computer Research and Development, 2021, 58(8): 1575-1585. DOI: 10.7544/issn1000-1239.2021.20210330

    Passive-Aggressive Learning with Feature Evolvable Streams

    • Abstract: In many real-world applications, data arrive as a feature evolvable stream. For example, when sensors are replaced, the features collected by the old sensors vanish and the features collected by the new sensors appear. Online passive-aggressive algorithms have been shown to learn linear classifiers effectively from data with a fixed feature space and from data with a trapezoidal feature space. Building on this, we propose passive-aggressive learning with feature evolvable streams (PAFE), which uses the margin to update the current classifier. PAFE learns two models with the passive-aggressive update strategy: one on the current feature space and one on the recovered vanished feature space. Specifically, during the overlapping period, in which old and new features are both available, the algorithm uses the new features to recover the vanished feature space and uses the old feature space to simulate the new one, which provides a reasonable initialization for the model on the new feature space. Based on these two models, we propose two ensemble methods to improve overall performance: combining the predictions of the two models, and dynamically selecting the current best prediction. Experiments on synthetic and real datasets validate the effectiveness of the proposed algorithm.
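
    The ingredients named in the abstract can be illustrated with a minimal sketch. It is not the PAFE algorithm as published. It assumes binary labels in {-1, +1}, the standard PA-I step size, a linear least-squares map for recovering vanished features, and a fixed mixing weight `alpha`. The function names `pa_update`, `fit_recovery_map`, and `combined_score` are hypothetical and chosen for illustration only.

```python
import numpy as np


def pa_update(w, x, y, C=1.0):
    """One standard PA-I step for a label y in {-1, +1} (assumed setting)."""
    loss = max(0.0, 1.0 - y * float(w @ x))  # hinge loss on the margin
    norm_sq = float(x @ x)
    if loss == 0.0 or norm_sq == 0.0:
        return w  # passive: margin already satisfied, keep the classifier
    tau = min(C, loss / norm_sq)  # aggressive: step size capped by C
    return w + tau * y * x


def fit_recovery_map(X_new, X_old):
    """Least-squares map M with X_new @ M ~= X_old, fit on overlap data.

    Assumption: a linear map stands in for the recovery step.
    """
    M, *_ = np.linalg.lstsq(X_new, X_old, rcond=None)
    return M


def combined_score(w_old, w_new, x_new, M, alpha=0.5):
    """Mix the score on recovered old features with the score on new ones.

    Assumption: alpha is fixed here; it is only a placeholder weight.
    """
    x_old_hat = x_new @ M  # recovered (vanished) features
    return alpha * float(w_old @ x_old_hat) + (1.0 - alpha) * float(w_new @ x_new)
```

    In this sketch the overlapping period supplies paired rows `X_new` and `X_old` for `fit_recovery_map`. After the old features vanish, each new instance updates both models through `pa_update`, and the sign of `combined_score` gives the combined prediction. Selecting the current best prediction would instead track each model's running loss and use the lower-loss model alone.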

       
