    Semi-supervised Open Vocabulary Multi-label Learning via Graph Prompting


      Abstract: Semi-supervised multi-label learning trains a model on both labeled and unlabeled data, reducing the annotation cost of multi-label data while achieving good results; it has therefore attracted sustained research attention. However, because the label space is large, the semi-supervised annotation process often leaves some labels with no annotated samples at all; such labels are called open vocabulary. Since the model cannot learn label information for these classes, its performance degrades. To address this problem, this paper proposes a semi-supervised open vocabulary multi-label learning method based on graph prompting. Specifically, the method fine-tunes a pre-trained large model with a prompt-based graph neural network to mine and exploit the relationship between the open vocabulary and the supervised samples: a graph neural network constructed from multimodal image and text data serves as the textual input of the pre-trained model. Furthermore, leveraging the generalization ability of the pre-trained model on the open vocabulary, pseudo-labels are generated for unlabeled samples and used to fine-tune the output classification layer, so that the model classifies open-vocabulary labels more accurately. Experimental results on multiple benchmark datasets, including VOC, COCO, CUB, and NUS, consistently show that the proposed method outperforms current mainstream methods and achieves state-of-the-art performance.
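      To make the graph-prompting idea concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation: label-name embeddings are treated as graph nodes, one round of message passing lets open-vocabulary labels borrow context from supervised labels, and the refined embeddings replace the plain prompt tokens fed to the (frozen) text encoder of a CLIP-style pre-trained model. All names here (GraphPrompt, adj, proj) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphPrompt(nn.Module):
    """Hypothetical sketch: refine label embeddings by message passing
    before they are used as the text-side prompt of a pre-trained
    vision-language model."""

    def __init__(self, dim, num_labels):
        super().__init__()
        # Learnable adjacency over labels; in practice it could instead be
        # fixed from label co-occurrence or word-embedding similarity.
        self.adj = nn.Parameter(torch.eye(num_labels))
        self.proj = nn.Linear(dim, dim)

    def forward(self, label_emb):               # (num_labels, dim)
        weights = F.softmax(self.adj, dim=-1)   # normalized edge weights
        msg = weights @ label_emb               # aggregate neighbor info
        return label_emb + self.proj(msg)       # residual update

# Usage: 80 labels (seen + open-vocabulary), 512-dim embeddings; the
# refined embeddings would then be passed to the frozen text encoder.
prompt = GraphPrompt(dim=512, num_labels=80)
refined = prompt(torch.randn(80, 512))
```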
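      The pseudo-labeling step can be sketched in the same spirit. The threshold, feature dimensions, and training loop below are illustrative assumptions, not the paper's exact procedure: the frozen pre-trained model scores unlabeled images against the (graph-refined) label features, labels scoring above a threshold become positive pseudo-targets, and only the output classification layer is trained on them.

```python
import torch
import torch.nn as nn

NUM_LABELS, DIM = 80, 512                      # hypothetical sizes

@torch.no_grad()
def pseudo_label(image_feats, label_feats, tau=0.5):
    # Per-label sigmoid scores from the frozen encoders; labels scoring
    # above tau become positive pseudo-targets for unlabeled images.
    scores = torch.sigmoid(image_feats @ label_feats.t())
    return (scores > tau).float()

# Fine-tune only the output classification layer on pseudo-labels,
# keeping the pre-trained backbone frozen.
classifier = nn.Linear(DIM, NUM_LABELS)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(classifier.parameters(), lr=1e-4)

label_feats = torch.randn(NUM_LABELS, DIM)     # stand-in for refined text features
for _ in range(10):                            # stand-in for an unlabeled-data loader
    image_feats = torch.randn(32, DIM)         # stand-in for image features
    targets = pseudo_label(image_feats, label_feats)
    loss = criterion(classifier(image_feats), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```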

       
