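• China Outstanding Sci-Tech Journal
  • CCF-recommended Class A Chinese-language journal
  • T1-class high-quality sci-tech journal in the computing field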
Li Zhixin, Wei Haiyang, Zhang Canlong, Ma Huifang, Shi Zhongzhi. Research Progress on Image Captioning[J]. Journal of Computer Research and Development, 2021, 58(9): 1951-1974. DOI: 10.7544/issn1000-1239.2021.20200281

Research Progress on Image Captioning

Funds: This work was supported by the National Natural Science Foundation of China (61966004, 61663004, 61866004, 61762078) and the Guangxi Natural Science Foundation (2019GXNSFDA245018, 2018GXNSFDA281009, 2017GXNSFAA198365).
  • Published Date: August 31, 2021
  • Image captioning combines two research fields, computer vision and natural language processing. It requires not only a complete semantic understanding of the image but also complex natural language expression, and it is a crucial task for further research on visual intelligence that is consistent with human perception. This paper reviews the research progress on image captioning. First, five key technologies involved in current deep-learning-based image captioning methods are summarized and analyzed: overall architecture, learning strategy, feature mapping, language model, and attention mechanism. Then, following the field's development, the existing image captioning methods are divided into four categories: template-based methods, retrieval-based methods, encoder-decoder architecture based methods, and compositional architecture based methods. We describe the basic concepts, representative methods, and research status of each category. We discuss in particular detail the various methods based on the encoder-decoder architecture and their innovative ideas, such as multimodal space, visual space, semantic space, attention mechanism, and model optimization. Next, from the experimental point of view, we present the common benchmark datasets and evaluation measures in the field of image captioning, and we compare the performance of some typical methods on two benchmark datasets. Finally, with the aim of improving the accuracy, integrity, novelty, and diversity of image captions, several future development trends of image captioning are presented.