Zhang Lu, Cao Feng, Liang Xinyan, Qian Yuhua. Cross-Modal Retrieval with Correlation Feature Propagation[J]. Journal of Computer Research and Development, 2022, 59(9): 1993-2002. DOI: 10.7544/issn1000-1239.20210475
1(Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006)
2(Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006)
3(School of Computer and Information Technology, Shanxi University, Taiyuan 030006)
Funds: This work was supported by the National Natural Science Foundation of China (61672332, 62136005), the Key Research and Development Program of Shanxi Province (201903D421003), and the Science and Technology Achievements Transformation and Cultivation Project of Shanxi Provincial Education Department (2020CG001).
With the rapid development of deep learning and continued research on correlation learning, the performance of cross-modal retrieval has improved greatly. The central challenge in cross-modal retrieval is that data from different modalities are related in their high-level semantics but separated by a heterogeneous gap in their low-level features. Existing methods mainly address this gap by mapping the features of different modalities into a single feature space under a single correlation constraint. However, research on representation learning shows that features from different layers can all help improve a model's final performance. The correlation captured by the single feature space that existing methods learn is therefore weak, and that space may not be the optimal retrieval space. To solve this problem, we propose a model for cross-modal retrieval with correlation feature propagation. Its basic idea is to strengthen the correlation between the layers of the deep network: features of an earlier layer that already have a certain degree of correlation are passed to the next layer through a nonlinear transformation, which makes it easier to find a feature space in which the two modalities are more strongly correlated. Extensive experiments on the Wikipedia and Pascal datasets show that the proposed method improves mean average precision.
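For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how mean average precision could be computed for a cross-modal query, for example an image feature ranked against a gallery of text features. It assumes that both modalities have already been mapped into a common feature space, that gallery items are ranked by cosine similarity, and that a retrieved item counts as relevant when it shares the query's class label. These choices, together with the function names `cosine`, `average_precision`, and `mean_average_precision`, are illustrative assumptions and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_precision(query, query_label, gallery, gallery_labels):
    """Average precision of one query against a gallery from the other modality."""
    # Rank gallery indices by descending similarity to the query.
    order = sorted(range(len(gallery)), key=lambda j: -cosine(query, gallery[j]))
    hits, precision_sum = 0, 0.0
    for rank, j in enumerate(order, start=1):
        # Assumption: an item is relevant if it has the same class label.
        if gallery_labels[j] == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    # A query with no relevant gallery item contributes 0 in this sketch.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries, query_labels, gallery, gallery_labels):
    """Mean of the per-query average precisions."""
    aps = [
        average_precision(q, label, gallery, gallery_labels)
        for q, label in zip(queries, query_labels)
    ]
    return sum(aps) / len(aps) if aps else 0.0
```

In a two-way evaluation, the same function would be called once with image features as queries and text features as the gallery, and once with the roles swapped.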