Self-Paced Learning for Open-Set Domain Adaptation

刘星宏; 周毅; 周涛; 秦杰

doi:10.7544/issn1000-1239.202330210

Abstract: Domain adaptation aims to generalize knowledge acquired from a source domain to a target domain with a different data distribution. Traditional domain adaptation methods assume that the source and target domains share the same classes, which is not always the case in real-world scenarios. Open-set domain adaptation (OSDA) addresses this limitation by introducing unknown classes in the target domain to represent categories that are absent from the source domain. OSDA aims not only to recognize target samples belonging to the known classes shared by the source and target domains, but also to identify samples of unknown classes. Traditional domain adaptation methods align the entire target domain with the source domain to minimize domain shift, which inevitably leads to negative transfer in OSDA scenarios. To address these challenges, we propose a novel framework based on self-paced learning, referred to as SPL-OSDA (self-paced learning for open-set domain adaptation), which precisely distinguishes known-class from unknown-class samples and performs domain adaptation. To utilize unlabeled target samples for self-paced learning, we generate pseudo labels for target samples and design a cross-domain mixup method tailored to OSDA scenarios. This strategy minimizes the noise in the pseudo labels and ensures that the model progressively learns known-class features of the target domain, beginning with simpler examples and advancing to more complex ones. To improve the reliability of the model in open-set scenarios and meet the requirements of trustworthy AI, multiple criteria are used to distinguish between known and unknown samples. Furthermore, unlike existing OSDA methods that require manual tuning of a threshold hyperparameter to separate known and unknown classes, the proposed method automatically tunes a suitable threshold, eliminating the need for empirical tuning during testing. Compared with empirical threshold tuning, our model exhibits good robustness under different hyperparameters and experimental settings. Comprehensive experiments show that our method consistently achieves superior performance on different benchmarks compared with various state-of-the-art methods.
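
To make the easy-to-hard idea concrete, the following is a minimal, illustrative sketch only, not the SPL-OSDA implementation. It assumes that "easy" target samples are those with low prediction entropy, that the admitted fraction of samples grows linearly with the training round, and that cross-domain mixup is a plain convex combination of a source sample and a pseudo-labeled target sample. The function names `self_paced_select` and `mixup`, and the linear schedule, are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def self_paced_select(target_probs, round_idx, num_rounds):
    """Return (index, pseudo_label) pairs for the currently 'easy' target samples.

    Assumption: easiness is measured by low prediction entropy, and the
    number of admitted samples grows linearly with the round index.
    """
    n = len(target_probs)
    # Integer arithmetic avoids floating-point rounding in the schedule.
    k = max(1, (round_idx + 1) * n // num_rounds)
    ranked = sorted(range(n), key=lambda i: entropy(target_probs[i]))
    selected = []
    for i in ranked[:k]:
        probs = target_probs[i]
        pseudo_label = max(range(len(probs)), key=probs.__getitem__)
        selected.append((i, pseudo_label))
    return selected


def mixup(x_src, y_src, x_tgt, y_tgt, lam):
    """Convex combination of a source sample and a pseudo-labeled target sample.

    x_* are feature vectors, y_* are one-hot (or soft) label vectors,
    and lam in [0, 1] is the mixing weight given to the source sample.
    """
    x = [lam * a + (1 - lam) * b for a, b in zip(x_src, x_tgt)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_src, y_tgt)]
    return x, y


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    for r in range(3):
        print(r, self_paced_select(probs, r, 3))
```

In early rounds only the most confident target predictions receive pseudo labels, which limits label noise; later rounds admit progressively harder samples. How unknown-class samples are rejected and how the threshold is tuned automatically are not covered by this sketch.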
