Abstract:
Partial label learning is a recently proposed weakly supervised machine learning framework. It relaxes the requirements on the training data: a model can be learned when each training example is associated only with a set of candidate labels that contains the ground-truth label. As a result, the framework can be applied more conveniently to many real-world tasks. The ambiguity in the training data inevitably makes partial label learning harder than traditional classification, and so far only a few algorithms, all suited to small-scale training sets, are available. Based on error-correcting output codes (ECOC) and a variational Gaussian process model, this paper presents a fast kernel-based partial label learning algorithm that handles large-scale problems effectively. The basic strategy is first to convert the original training set into several standard two-class data sets using ECOC, and then to train a binary classifier with lower computational complexity on each two-class data set using the variational Gaussian process model. Experimental results show that the proposed algorithm achieves almost the same accuracy as existing state-of-the-art kernel-based partial label learning algorithms while requiring less computing time. In particular, it can handle problems with millions of samples within 40 minutes on a personal computer.
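The ECOC conversion step can be illustrated with a minimal sketch. The abstract does not spell out the conversion rule, so the sketch assumes one common convention: each ECOC column is a random split of the label space into a positive and a negative half, and a training example is kept for that column only if its whole candidate set falls on one side. The function name `ecoc_binary_datasets` and all of its parameters are hypothetical, and the variational Gaussian process classifier that would then be trained on each binary set is not shown.

```python
import numpy as np

def ecoc_binary_datasets(X, candidate_sets, num_labels, num_columns, seed=0):
    """Build one two-class data set per ECOC column from partial-label data.

    Assumed rule (not taken from the paper): an example is positive if all of
    its candidate labels lie in the column's positive half, negative if all
    lie in the negative half, and is dropped for that column otherwise.
    """
    rng = np.random.default_rng(seed)
    datasets = []
    for _ in range(num_columns):
        # Draw a random dichotomy of the label space; reject trivial splits.
        while True:
            column = rng.integers(0, 2, size=num_labels).astype(bool)
            if column.any() and not column.all():
                break
        kept, targets = [], []
        for i, candidates in enumerate(candidate_sets):
            bits = column[list(candidates)]
            if bits.all():
                kept.append(i)
                targets.append(1)
            elif not bits.any():
                kept.append(i)
                targets.append(-1)
            # Candidate sets that straddle the split are ambiguous: skip them.
        kept = np.array(kept, dtype=int)
        datasets.append((column, X[kept], np.array(targets, dtype=int)))
    return datasets
```

Each returned triple holds the column's label split, the retained feature rows, and their binary targets in {+1, -1}. Any standard binary classifier could be trained on these sets; at prediction time the per-column outputs would be decoded against the coding matrix to recover a class label.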