Citation: Zhang Chengde, Liu Yuxuan, Xiao Xia, Mei Kai. Hot Topic Detection of Web Video Based on Cross-Media Semantic Association Enhancement[J]. Journal of Computer Research and Development, 2023, 60(11): 2624-2637. DOI: 10.7544/issn1000-1239.202220560
Cross-media hot topic detection for web videos has become an active research area. However, web videos come with little descriptive text, so the text semantic feature space is sparse and the correlation between text semantic features is weak, which makes hot topics harder to mine. Existing methods mainly enrich the text semantic feature space with visual information. Because visual and text information are heterogeneous, though, the text and visual semantic features of the same topic differ considerably. This further weakens the correlation between text semantics under the same topic and poses a major challenge for cross-media hot topic detection on web videos. We therefore propose a new cross-media semantic association enhancement method. First, double-layer attention captures the core semantic features of the text at the word and sentence levels. Second, by understanding the visual content, the method generates a large number of text descriptions closely related to the video content to enrich the text semantic space. Third, a text semantic graph and a visual semantic graph are built from text semantic similarity and visual semantic similarity, and a time decay function establishes the correlation between cross-media data along the time dimension; this strengthens the association between text and visual semantics, and the two graphs are smoothly fused into a hybrid semantic graph to achieve cross-media semantic complementarity. Finally, hot topics are detected by graph clustering. Extensive experimental results show that the proposed model outperforms existing methods.
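The fusion and clustering steps described above can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual formulas, so everything concrete here is an assumption made for illustration: the exponential form of the decay, the rate `lam`, the blending weight `alpha`, the edge `threshold`, the toy similarity values, and the use of connected components as a simple stand-in for the graph clustering method.

```python
import math

def time_decay(t_i, t_j, lam=0.1):
    """Assumed exponential decay in the upload-time gap (in days)."""
    return math.exp(-lam * abs(t_i - t_j))

def fuse_graphs(text_sim, visual_sim, times, alpha=0.5, lam=0.1):
    """Blend text and visual similarity, then damp edges between
    videos that were uploaded far apart in time."""
    n = len(times)
    fused = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            blended = alpha * text_sim[i][j] + (1 - alpha) * visual_sim[i][j]
            fused[i][j] = blended * time_decay(times[i], times[j], lam)
    return fused

def connected_components(weights, threshold=0.3):
    """Stand-in for graph clustering: connected components of the
    graph that keeps only edges with weight >= threshold."""
    n = len(weights)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in range(n):
                if v not in seen and weights[u][v] >= threshold:
                    seen.add(v)
                    stack.append(v)
        clusters.append(sorted(component))
    return clusters

# Toy data (hypothetical): four videos with upload days 0, 1, 2 and 30.
times = [0, 1, 2, 30]
text_sim = [
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.2, 0.0],
    [0.1, 0.2, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]
visual_sim = [
    [0.0, 0.6, 0.5, 0.0],
    [0.6, 0.0, 0.8, 0.0],
    [0.5, 0.8, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]

fused = fuse_graphs(text_sim, visual_sim, times)
clusters = connected_components(fused)
print(clusters)
```

In this toy setup, videos 1 and 2 have weak text similarity but strong visual similarity, so the fused edge keeps them in the same cluster, which is the complementarity the abstract refers to. Video 3 is highly similar to video 2 in both modalities, yet the 28-day gap shrinks its fused edge well below the threshold, so it ends up in its own cluster.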