一种基于最大值损失函数的快速偏标记学习算法

周瑜; 贺建军; 顾宏; 张俊星

doi:10.7544/issn1000-1239.2016.20150267

一种基于最大值损失函数的快速偏标记学习算法

A Fast Partial Label Learning Algorithm Based on Max-loss Function

摘要

摘要: 在弱监督信息条件下进行学习已成为大数据时代机器学习领域的研究热点，偏标记学习是最近提出的一种重要的弱监督学习框架,主要解决在只知道训练样本的真实标记属于某个候选标记集合的情况下如何进行学习的问题，在很多领域都具有广泛应用.最大值损失函数可以很好地描述偏标记学习中的样本与候选标记间的关系，但是由于建立的模型通常是一个难以求解的非光滑函数，目前还没有建立基于该损失函数的偏标记学习算法.此外，已有的偏标记学习算法都只能处理样本规模比较小的问题，还没看到面向大数据的算法.针对以上2个问题，先利用凝聚函数逼近最大值损失函数中的max(·)将模型的目标函数转换为一个光滑的凹函数，然后利用随机拟牛顿法对其进行求解，最终实现了一种基于最大值损失函数的快速偏标记学习算法.仿真实验结果表明，此算法不仅要比基于均值损失函数的传统算法取得更好的分类精度，运行速度上也远远快于这些算法，处理样本规模达到百万级的问题只需要几分钟.

Abstract: In the age of big data, learning with weak supervision has become one of the hot research topics in machine learning field. Partial label learning, which deals with the problem where each training example is associated with a set of candidate labels among which only one label corresponds to the ground-truth, is an important weakly-supervised machine learning frameworks proposed recently and can be widely used in many real world tasks. The max-loss function may be used to accurately capture the relationship between the partial labeled sample and its labels. However, since the max-loss function usually brings us a nondifferentiable objective function difficult to be solved, it is rarely adopted in the existing algorithms. Moreover, the existing partial label learning algorithms can only deal with the problem with small-scale data, and rarely can be used to deal with big data. To cure above two problems, this paper presents a fast partial label learning algorithm with the max-loss function. The basic idea is to transform the nondifferentiable objective to a differentiable concave function by introducing the aggregate function to approximate the max(·) function involved in the max-lass function, and then to solve the obtained concave objective function by using a stochastic quasi-Newton method. The experimental results show that the proposed algorithm can not only achieve higher accuracy but also use shorter computing time than the state-of-the-art algorithms with average-loss functions. Moreover, the proposed algorithm can deal with the problems with millions samples within several minutes.

HTML全文

参考文献(0)

施引文献

资源附件(0)