高级检索

    面向标记分布学习的标记增强

    Label Enhancement for Label Distribution Learning

    • 摘要: 多标记学习(multi-label learning, MLL)任务处理一个示例对应多个标记的情况,其目标是学习一个从示例到相关标记集合的映射.在MLL中,现有方法一般都是采用均匀标记分布假设,也就是各个相关标记(正标记)对于示例的重要程度都被当作是相等的.然而,对于许多真实世界中的学习问题,不同相关标记的重要程度往往是不同的.为此,标记分布学习将不同标记的重要程度用标记分布来刻画,已经取得很好的效果.但是很多数据中却仅包含简单的逻辑标记而非标记分布.为解决这一问题,可以通过挖掘训练样本中蕴含的标记重要性差异信息,将逻辑标记转化为标记分布,进而通过标记分布学习有效地提升预测精度.上述将原始逻辑标记提升为标记分布的过程,定义为面向标记分布学习的标记增强.首次提出了标记增强这一概念,给出了标记增强的形式化定义,总结了现有的可以用于标记增强的算法,并进行了对比实验.实验结果表明:使用标记增强能够挖掘出数据中隐含的标记重要性差异信息,并有效地提升MLL的效果.

       

      Abstract: Multi-label learning (MLL) deals with the case where each instance is associated with multiple labels. Its target is to learn the mapping from instance to relevant label set. Most existing MLL methods adopt the uniform label distribution assumption, i.e., the importance of all relevant (positive) labels is the same for the instance. However, for many real-world learning problems, the importance of different relevant labels is often different. For this issue, label distribution learning (LDL) has achieved good results by modeling the different importance of labels with a label distribution. Unfortunately, many datasets only contain simple logical labels rather than label distributions. To solve the problem, one way is to transform the logical labels into label distributions by mining the hidden label importance from the training examples, and then promote prediction precision via label distribution learning. Such process of transforming logical labels into label distributions is defined as label enhancement for label distribution learning. This paper first proposes the concept of label enhancement with a formal definition. Then, existing algorithms that can be used for label enhancement have been surveyed, and compared in the experiments. Results of the experiments reveal that label enhancement can effectively discover the difference of the label importance hidden in the data, and improve the performance of multi-label learning.

       

    /

    返回文章
    返回