Abstract:
Distribution shift in real-world data degrades model performance, making unsupervised domain adaptation a valuable and widely applicable solution. Traditional machine learning assumes that the training data X_S and the test data X_T are independent and identically distributed (IID). In practice, however, a distribution shift between training and test data is common, and models built on the IID assumption lose robustness. Researchers have therefore studied the theoretical foundations and methods of unsupervised domain adaptation extensively, advancing applications such as autonomous driving and intelligent healthcare. Two open problems remain in mainstream unsupervised domain adaptation approaches: whether the distances computed by existing methods truly reflect the difference between the two distributions, and how to measure that difference more accurately. In addition, how to verify a model's ability to transfer knowledge during training with the help of pseudo-labels is an issue worth continued exploration. We therefore propose backward pseudo-label and optimal transport for unsupervised domain adaptation (BPLOT), which measures the difference between the two distributions more accurately from the perspective of optimal transport and explicitly reduces their distance in the feature space. In addition, we propose a backward pseudo-label verification method that checks the quality of pseudo-labels during training to verify the model's ability to transfer knowledge and to guide training. Finally, we evaluate the proposed network on multiple unsupervised domain adaptation datasets; BPLOT outperforms all compared baseline methods.
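To illustrate the optimal-transport perspective on measuring the difference between two feature distributions, the following is a minimal sketch of the entropy-regularized (Sinkhorn) transport cost between two empirical samples. All names and hyperparameters here are illustrative assumptions, not BPLOT's actual implementation, which is defined in the body of the paper.

```python
import numpy as np

def sinkhorn_distance(xs, xt, eps=0.05, n_iter=200):
    """Entropy-regularized optimal transport cost between two empirical
    feature distributions xs (n, d) and xt (m, d), with uniform marginals.
    The cost matrix is normalized for numerical stability, so returned
    values are relative, not absolute Wasserstein distances."""
    n, m = len(xs), len(xt)
    # Pairwise squared-Euclidean cost between source and target features.
    C = ((xs[:, None, :] - xt[None, :, :]) ** 2).sum(-1)
    C = C / C.max()                                   # normalize to [0, 1]
    K = np.exp(-C / eps)                              # Gibbs kernel
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)   # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                           # Sinkhorn fixed point
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                   # transport plan
    return float((P * C).sum())                       # transport cost

# A sample compared with itself yields a near-zero cost; a shifted
# sample yields a strictly larger one.
rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, (64, 8))
xt = rng.normal(2.0, 1.0, (64, 8))
d_same = sinkhorn_distance(xs, xs)
d_shift = sinkhorn_distance(xs, xt)
```

Minimizing such a transport cost between source and target features is one way to realize the "explicitly reduce the distance between the two distributions in the feature space" objective stated above.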