Peng Yuxin, Qi Jinwei, Huang Xin. Current Research Status and Prospects on Multimedia Content Understanding[J]. Journal of Computer Research and Development, 2019, 56(1): 183-208. DOI: 10.7544/issn1000-1239.2019.20180770
Current Research Status and Prospects on Multimedia Content Understanding

  • Published Date: December 31, 2018
  • With the rapid development of multimedia and Internet technologies, large amounts of multimedia data such as images, video, text and audio have emerged. Data of different media types from multiple sources are heterogeneous in form but correlated in semantics. Research in cognitive science indicates that humans perceive and understand their environment by fusing information across different sensory organs, a process determined by the organizational structure of the human brain. Semantic analysis and correlation modeling across media types has therefore become a key challenge in achieving comprehensive multimedia content understanding, and it has drawn wide interest from both academia and industry. This paper reviews the basic concepts, representative methods and research status of five recent, prominent research topics in multimedia content understanding: fine-grained image classification and retrieval, video classification and object detection, cross-media retrieval, visual description and generation, and visual question answering. It then presents the major challenges of multimedia content understanding and outlines future development trends. The goal is to help readers gain a comprehensive understanding of the research status of multimedia content understanding, draw more researchers' attention to the relevant topics, and provide technical insights to promote further development of the area.
