Online Semi-Supervised Learning with Multi-Kernel Ensemble

Li Ming (黎铭) and Zhou Zhihua (周志华)

Abstract

In many real-time prediction tasks, a learner must learn online from examples that arrive in sequence. Because the data are collected in real time, it is usually infeasible to label every example in the stream. However, most state-of-the-art online learning methods cannot exploit unlabeled data, so the learned model fails to track changes in the data promptly and its real-time responsiveness suffers. This paper proposes an online semi-supervised learning method based on multi-kernel ensemble, which enables online learning even when the received example is unlabeled. The method uses the degree of agreement, on unlabeled data, among multiple functions defined in different reproducing kernel Hilbert spaces (RKHSs) as a regularization term, from which a regularized instantaneous risk functional for multi-kernel ensemble online semi-supervised learning is derived. Online convex programming is then employed to minimize the derived risk. Experimental results on UCI data sets and an application to network intrusion detection show that the proposed method can effectively exploit the unlabeled data in a stream to improve the performance of online learning.
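The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the Gaussian kernels and their widths, the hinge loss on labeled examples, the squared deviation from the ensemble mean as the disagreement term, the kernelized gradient-descent update, and all names (`KernelLearner`, `MultiKernelOnlineSSL`, `eta`, `lam`, `mu`) are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(width):
    """Return k(a, b) = exp(-||a - b||^2 / (2 * width^2))."""
    def k(a, b):
        sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return math.exp(-sq / (2.0 * width ** 2))
    return k

class KernelLearner:
    """One function f(x) = sum_i alpha_i * k(x_i, x) in a single RKHS."""
    def __init__(self, kernel):
        self.kernel = kernel
        self.points = []
        self.alphas = []

    def predict(self, x):
        return sum(a * self.kernel(p, x) for p, a in zip(self.points, self.alphas))

    def step(self, x, grad, eta, lam):
        # Gradient step on the regularized instantaneous risk:
        # f <- (1 - eta * lam) * f - eta * grad * k(x, .)
        self.alphas = [(1.0 - eta * lam) * a for a in self.alphas]
        if grad != 0.0:
            self.points.append(x)
            self.alphas.append(-eta * grad)

class MultiKernelOnlineSSL:
    """Hypothetical ensemble: one learner per kernel width (assumed values)."""
    def __init__(self, widths=(0.5, 1.0, 2.0), eta=0.2, lam=0.01, mu=0.5):
        self.learners = [KernelLearner(gaussian_kernel(w)) for w in widths]
        self.eta, self.lam, self.mu = eta, lam, mu

    def predict(self, x):
        outs = [f.predict(x) for f in self.learners]
        return sum(outs) / len(outs)

    def update(self, x, y=None):
        outs = [f.predict(x) for f in self.learners]
        mean = sum(outs) / len(outs)
        for f, out in zip(self.learners, outs):
            if y is None:
                # Unlabeled: gradient of (mu / 2) * sum_k (f_k(x) - mean)^2
                # with respect to f_k(x), which pulls learners toward agreement.
                grad = self.mu * (out - mean)
            else:
                # Labeled: subgradient of the hinge loss max(0, 1 - y * f_k(x)).
                grad = -y if y * out < 1.0 else 0.0
            f.step(x, grad, self.eta, self.lam)

# Toy stream: labels are +1 / -1, and None marks an unlabeled example.
stream = [
    ((0.0, 0.0), -1), ((2.0, 2.0), +1),
    ((0.1, -0.1), None), ((1.9, 2.1), None),
    ((0.2, 0.1), -1), ((2.1, 1.8), +1),
]
model = MultiKernelOnlineSSL()
for x, y in stream:
    model.update(x, y)
```

Each learner stores one coefficient per stored example, so memory grows with the stream; a practical variant would truncate or budget the expansion, which this sketch omits.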
