Citation: Li Ang, Du Junping, Kou Feifei, Xue Zhe, Xu Xin, Xu Mingying, Jiang Yang. Scientific and Technological Information Oriented Semantics-Adversarial and Media-Adversarial Based Cross-Media Retrieval Method[J]. Journal of Computer Research and Development, 2023, 60(11): 2660-2670. DOI: 10.7544/issn1000-1239.202220430
Cross-media retrieval of scientific and technological information is an important task in cross-media research. It retrieves target information from massive, multi-source, heterogeneous scientific and technological resources, and supports applications that meet users' needs, such as scientific and technological information recommendation and personalized scientific and technological information retrieval. The core of cross-media retrieval is learning a common subspace in which data from different media can be compared directly. In subspace learning, existing methods often focus on modeling the discrimination of intra-media data and the invariance of inter-media data after mapping, while ignoring semantic consistency within media and media discrimination within semantics, which limits retrieval performance. To address this, we propose a scientific and technological information oriented semantics-adversarial and media-adversarial cross-media retrieval method (SMCR) to find an effective common subspace. Specifically, in addition to modeling intra-media semantic discrimination, SMCR minimizes the inter-media semantic consistency loss so that semantic similarity is preserved before and after mapping. Furthermore, SMCR constructs a basic feature mapping network and a refined feature mapping network that jointly minimize the media discriminative loss within semantics, which strengthens the feature mapping networks' ability to confuse the media discriminant network. Experimental results on two datasets show that SMCR outperforms state-of-the-art methods in cross-media retrieval.
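The common-subspace idea in the abstract can be illustrated with a minimal sketch. This is not the SMCR method: the adversarially trained mapping networks are replaced here by fixed, hand-picked linear maps, and every name and value (`W_IMAGE`, `W_TEXT`, `project`, `rank_cross_media`, the toy feature vectors) is a hypothetical assumption chosen only to show how items from two media become directly comparable once they are mapped into one shared space.

```python
import math


def project(features, weights):
    """Map a feature vector into the common subspace (one output per row of `weights`)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def rank_cross_media(query_vec, candidate_vecs):
    """Return candidate indices sorted by decreasing cosine similarity to the query."""
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(candidate_vecs)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy setup: 3-d "image" features and 2-d "text" features,
# each with its own linear map into a shared 2-d subspace.
# In a learned model these maps would be trained; here they are fixed by hand.
W_IMAGE = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
W_TEXT = [[0.0, 1.0],
          [1.0, 0.0]]

image_query = [0.9, 0.1, 0.5]
text_items = [[0.1, 1.0],
              [1.0, 0.1],
              [0.5, 0.5]]

# Both media are compared only after projection into the shared space.
query_common = project(image_query, W_IMAGE)
texts_common = [project(t, W_TEXT) for t in text_items]
ranking = rank_cross_media(query_common, texts_common)
print(ranking)
```

The raw image and text vectors have different dimensions and cannot be compared directly; after projection, a single similarity measure ranks text items against an image query. Cosine similarity is used here as one common choice, not as a claim about the paper's metric.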