ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2021, Vol. 58 ›› Issue (8): 1575-1585. doi: 10.7544/issn1000-1239.2021.20210330

Special topic: 2021 Special Topic on Frontier Advances in Artificial Intelligence

• Artificial Intelligence •




Passive-Aggressive Learning with Feature Evolvable Streams

Liu Yanfang1,2, Li Wenbin1, Gao Yang1   

  1. 1(State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023);2(College of Mathematics and Information Engineering, Longyan University, Longyan, Fujian 364012)
  • Online: 2021-08-01
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2018AAA0100905), the Education Scientific Research Project of Young Teachers of Fujian Province (JAT190743), and the Science and Technology Project of Longyan City (2019LYF13002, 2019LYF12010).

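Abstract: In many real-world applications, data are collected as a feature evolvable stream. For example, when sensors are replaced, the features collected by the old sensors disappear and features collected by the new sensors appear. Online passive-aggressive algorithms have been shown to learn linear classifiers effectively from data with a fixed feature space or a trapezoidal feature space. A feature evolvable learning algorithm based on the passive-aggressive update strategy (passive-aggressive learning with feature evolvable streams, PAFE) is therefore proposed. Using the passive-aggressive update strategy, the algorithm learns two models, one from the current feature space and one from the recovered vanished feature space. Specifically, during the overlapping period, in which old and new features coexist, the algorithm uses the new features to recover the vanished feature space and uses the old feature space to simulate the new feature space, which provides a reasonable initialization for learning the model on the new feature space. Based on these two models, two ensemble methods are proposed to improve overall performance: combined prediction and current-best prediction. Experimental results on synthetic and real datasets verify the effectiveness of the algorithm.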


Abstract: In many real-world applications, data are collected in the form of a feature evolvable stream. For instance, when sensors with limited lifespans are replaced, the features gathered by the old sensors disappear while features gathered by the new sensors emerge. Online passive-aggressive algorithms have proven effective in learning linear classifiers from data with either a fixed feature space or a trapezoidal feature space. Therefore, in this paper we propose a feature evolvable learning algorithm based on the passive-aggressive update strategy (PAFE), which uses the margin to modify the current classifier. The proposed algorithm learns two models through the passive-aggressive update strategy, one from the current feature space and one from the recovered vanished feature space. Specifically, during the overlapping period, in which both old and new features are available, we recover the vanished features from the new features and obtain an initialization for the model on the current feature space. Furthermore, we use two ensemble methods to improve performance: combining the predictions of the two models, and dynamically selecting the best single prediction. Experiments on both synthetic and real data validate the effectiveness of the proposed algorithm.
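The margin-based update that the abstract refers to can be illustrated with a minimal sketch of a single passive-aggressive step. This is the standard PA-I rule for binary labels in {-1, +1}, not necessarily the exact update used in PAFE; the function name `pa_update` and the aggressiveness parameter `C` are assumptions made for illustration.

```python
def pa_update(w, x, y, C=1.0):
    """One PA-I step (assumed variant); returns a new weight list."""
    # Signed margin of the current classifier on (x, y).
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss
    if loss == 0.0:
        return list(w)  # passive: the margin is already at least 1
    sq_norm = sum(xi * xi for xi in x)
    if sq_norm == 0.0:
        return list(w)  # an all-zero instance carries no information
    # Aggressive: smallest step that fixes the margin, capped by C.
    tau = min(C, loss / sq_norm)
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


# Example: a zero classifier is pushed just far enough to reach margin 1.
w = pa_update([0.0, 0.0], [1.0, 2.0], 1)
```

When the margin is already at least 1 the weights are returned unchanged, which is the "passive" half of the strategy.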

Key words: online learning, passive-aggressive strategy, supervised learning, ensemble learning, evolvable features
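The two ensemble strategies named in the abstract, combining the two models' predictions and selecting the currently best one, might look roughly like the sketch below. The abstract does not state how PAFE weights or selects the models, so the exponential weighting on cumulative loss, the learning rate `eta`, and the function names `combine` and `select_best` are all assumptions made for illustration.

```python
import math


def combine(pred_old, pred_new, loss_old, loss_new, eta=1.0):
    """Weighted average of two real-valued predictions.

    Assumed scheme: each model's weight is exp(-eta * cumulative loss).
    """
    # Subtract the smaller loss first so the exponentials cannot underflow to 0/0.
    base = min(loss_old, loss_new)
    a = math.exp(-eta * (loss_old - base))
    b = math.exp(-eta * (loss_new - base))
    return (a * pred_old + b * pred_new) / (a + b)


def select_best(pred_old, pred_new, loss_old, loss_new):
    """Return the prediction of the model with the lower cumulative loss.

    Ties go to the model trained on the current feature space (an assumption).
    """
    return pred_old if loss_old < loss_new else pred_new


# Example: the recovered-feature model has lost less so far, so it dominates.
score = combine(1.0, -1.0, loss_old=0.5, loss_new=3.0)
```

Here `pred_old` stands for the prediction of the model on the recovered vanished features and `pred_new` for the model on the current features; the sign of the combined score would give the predicted label.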