Advanced Search
    Yu Jun, Wang Liang, Yu Zhou. Research on Visual Question Answering Techniques[J]. Journal of Computer Research and Development, 2018, 55(9): 1946-1958. DOI: 10.7544/issn1000-1239.2018.20180168
    Citation: Yu Jun, Wang Liang, Yu Zhou. Research on Visual Question Answering Techniques[J]. Journal of Computer Research and Development, 2018, 55(9): 1946-1958. DOI: 10.7544/issn1000-1239.2018.20180168

    Research on Visual Question Answering Techniques

    • With the significant advances of deep learning in computer vision and natural language processing, the existing methods are able to accurately understand the semantics of visual contents and natural languages, and carry out research on cross-media data representation and interaction. In recent years, visual question answering (VQA) has become a hot spot in cross-media expression and interaction area. The target of VQA is to learn a model to understand the visual content referred by a natural language question, and answer it automatically. This paper summarizes the research progresses in recent years on VQA from the aspects of concepts, models and datasets, and discusses the shortcomings of the current works. Finally, the possible future directions of VQA are discussed on methodology, applications and platforms.
    • loading

    Catalog

      Turn off MathJax
      Article Contents

      /

      DownLoad:  Full-Size Img  PowerPoint
      Return
      Return