    Online Classification Algorithm with Feature Inheritably Increasing and Decreasing

    • Abstract: Online learning has been studied extensively in recent years because of its practical value. In many open-environment applications, however, the data may gain new features at the current time step while only some of the original features are inherited at the next. In environmental monitoring, for example, newly deployed sensors produce new features; at the next time step some of the old sensors fail, so only part of the original features is retained. We call such data streaming data with inheritably increasing and decreasing features. Most traditional online learning algorithms assume a fixed feature space and cannot handle this setting directly. To address this problem, we propose an online classification algorithm with feature inheritably increasing and decreasing (OFID), together with two variants. When new features appear, the classifiers on the original features and on the new features are updated separately by combining the online passive-aggressive method with the principle of structural risk minimization. When old features disappear, the Frequent-Directions algorithm is used to complete the data stream so that the old classifier can continue to be updated. We prove upper bounds on the loss of the OFID family of algorithms, and extensive experiments verify the effectiveness of the proposed algorithms.
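    For readers unfamiliar with the passive-aggressive building block named in the abstract, the following is a minimal sketch of the standard PA-I update for binary labels on a fixed feature space. It is not the OFID update itself, which the abstract does not spell out; the function name `pa1_update` and the aggressiveness parameter `C` are illustrative assumptions.

```python
def pa1_update(w, x, y, C=1.0):
    """One standard PA-I step for a binary label y in {-1, +1}.

    Illustrative only: this is the generic passive-aggressive rule on a
    fixed feature space, not the OFID rule for changing feature spaces.
    """
    # Hinge loss of the current classifier on the example (x, y).
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)
    sq_norm = sum(xi * xi for xi in x)

    # Passive case: the margin is already satisfied (or x is all zeros).
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)

    # Aggressive case: move just far enough to fix the margin, capped by C.
    tau = min(C, loss / sq_norm)
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


# Small usage example on a two-feature stream.
w = [0.0, 0.0]
for x, y in [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([2.0, 0.0], 1)]:
    w = pa1_update(w, x, y)
```

    The update leaves the weights unchanged whenever the example is classified with margin at least 1, and otherwise takes the smallest step that corrects it, capped by `C`. How OFID combines such updates across the original and newly added features is described in the paper, not in this sketch.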

       
