Current Research Status and Prospects on Multimedia Content Understanding

Peng Yuxin; Qi Jinwei; Huang Xin

doi:10.7544/issn1000-1239.2019.20180770


Citation: Peng Yuxin, Qi Jinwei, Huang Xin. Current Research Status and Prospects on Multimedia Content Understanding[J]. Journal of Computer Research and Development, 2019, 56(1): 183-208. DOI: 10.7544/issn1000-1239.2019.20180770


(Institute of Computer Science and Technology, Peking University, Beijing 100871)


Published Date: December 31, 2018


Abstract


With the rapid development of multimedia and Internet technologies, large amounts of multimedia data such as images, video, text and audio have emerged. Data of different media types from multiple sources are heterogeneous in form but related in semantics. Research in cognitive science indicates that humans perceive and understand their environment by fusing information across different sensory organs, a process determined by the organizational structure of the human brain. Semantic analysis and correlation modeling across media types has therefore become a key challenge in achieving comprehensive multimedia content understanding, and it has drawn wide interest from both academia and industry. This paper reviews the basic concepts, representative methods and research status of five current research topics in multimedia content understanding: fine-grained image classification and retrieval, video classification and object detection, cross-media retrieval, visual description and generation, and visual question answering. It then presents the major challenges of multimedia content understanding and discusses future development trends. The goal is to help readers gain a comprehensive understanding of the research status of the field, draw more researchers' attention to these topics, and provide technical insights that promote further development of this area.
