Scientific and Technological Information Oriented Semantics-Adversarial and Media-Adversarial Based Cross-Media Retrieval Method

Li Ang; Du Junping; Kou Feifei; Xue Zhe; Xu Xin; Xu Mingying; Jiang Yang

doi:10.7544/issn1000-1239.202220430

Journal of Computer Research and Development > 2023 > 60(11): 2660-2670. > DOI: 10.7544/issn1000-1239.202220430

Li Ang, Du Junping, Kou Feifei, Xue Zhe, Xu Xin, Xu Mingying, Jiang Yang. Scientific and Technological Information Oriented Semantics-Adversarial and Media-Adversarial Based Cross-Media Retrieval Method[J]. Journal of Computer Research and Development, 2023, 60(11): 2660-2670. DOI: 10.7544/issn1000-1239.202220430

School of Computer Science (National Pilot School of Software Engineering), Beijing University of Posts and Telecommunications, Beijing 100876
Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia (Beijing University of Posts and Telecommunications), Beijing 100876

Funds: This work was supported by the Major Program of the National Natural Science Foundation of China (62192784) and the 8th Young Elite Scientists Sponsorship Program by CAST (2022QNRC001).

Author Bio:
Li Ang: born in 1993. PhD candidate. Member of CCF. His main research interests include information retrieval, data mining, and machine learning.

Du Junping: born in 1963. Professor. Fellow of CCF. Her main research interests include artificial intelligence, machine learning, and pattern recognition.

Kou Feifei: born in 1989. Lecturer. Member of CCF. Her main research interests include semantic learning and multimedia information processing.

Xue Zhe: born in 1987. Associate professor. Member of CCF. His main research interests include machine learning, artificial intelligence, data mining, and image processing.

Xu Xin: born in 1992. PhD. Member of CCF. Her main research interests include knowledge graphs, information retrieval, and machine learning.

Xu Mingying: born in 1987. PhD. Member of CCF. Her main research interests include intelligent information retrieval, science & technology big data analysis, and data mining.

Jiang Yang: born in 1995. Master. His main research interests include natural language processing, cross-media retrieval, and deep learning.

Received Date: May 27, 2022
Revised Date: November 17, 2022
Available Online: May 03, 2023

Abstract

Cross-media retrieval of scientific and technological information is an important task in cross-media research. It obtains target information from massive multi-source, heterogeneous scientific and technological resources, which supports applications that meet users’ needs, such as scientific and technological information recommendation and personalized scientific and technological information retrieval. The core of cross-media retrieval is learning a common subspace in which data from different media can be compared directly. In subspace learning, existing methods often focus on modeling the discrimination of intra-media data and the invariance of inter-media data after mapping, while ignoring semantic consistency within media and media discrimination within semantics, which limits retrieval performance. To address this, we propose a scientific and technological information oriented semantics-adversarial and media-adversarial cross-media retrieval method (SMCR) that learns an effective common subspace. Specifically, in addition to modeling intra-media semantic discrimination, SMCR minimizes the inter-media semantic consistency loss so that semantic similarity is preserved before and after mapping. Furthermore, SMCR constructs a basic feature mapping network and a refined feature mapping network that jointly minimize the media discriminative loss within semantics, which enhances the feature mapping network’s ability to confuse the media discriminant network. Experimental results on two datasets show that SMCR outperforms state-of-the-art methods in cross-media retrieval.
Keywords:
- cross-media retrieval
- adversarial learning
- scientific and technological information
- media constraint
- semantic consistency
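
The abstract's central idea, a common subspace in which data from different media can be compared directly, can be illustrated with a small sketch. It is a toy illustration under stated assumptions and not the SMCR implementation: fixed linear maps stand in for the learned feature mapping networks, cosine similarity is assumed as the comparison measure, and the media discriminator is reduced to a single probability scored with a generic binary cross-entropy. All names (`project`, `retrieve`, `media_discriminator_loss`, `W_IMAGE`, `W_TEXT`) and all numbers are hypothetical.

```python
import math


def project(W, x):
    """Map feature vector x into the common subspace with a linear map W.

    Assumption: a fixed matrix stands in for a learned feature mapping network.
    """
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, candidates):
    """Return candidate indices ordered by decreasing similarity to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: -scores[i])


def media_discriminator_loss(p_image, is_image):
    """Generic binary cross-entropy for a discriminator that outputs P(media = image).

    Assumption: this is a textbook adversarial loss, not the loss defined in the paper.
    """
    eps = 1e-12
    p = min(max(p_image, eps), 1.0 - eps)
    return -math.log(p) if is_image else -math.log(1.0 - p)


# Hypothetical toy data: 3-d image features, 2-d text features, 2-d common subspace.
W_IMAGE = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_TEXT = [[1.0, 0.0], [0.0, 1.0]]

images = [[0.9, 0.1, 0.5], [0.1, 0.8, 0.3]]
text_query = [1.0, 0.0]

# Text-to-image retrieval: both media are mapped, then compared in one space.
image_embeddings = [project(W_IMAGE, x) for x in images]
query_embedding = project(W_TEXT, text_query)
ranking = retrieve(query_embedding, image_embeddings)

# A discriminator that can tell the media apart has a low loss; when the mapped
# features hide the source medium, its prediction drifts toward 0.5 and the loss rises.
confident = media_discriminator_loss(0.99, True)
confused = media_discriminator_loss(0.5, True)
```

In this sketch, `ranking` lists the image indices from most to least similar to the text query, and `confused` exceeds `confident`, which is the sense in which mapped features that hide their source medium "confuse" a media discriminator.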
