Yu Jun, Wang Liang, Yu Zhou. Research on Visual Question Answering Techniques[J]. Journal of Computer Research and Development, 2018, 55(9): 1946-1958. DOI: 10.7544/issn1000-1239.2018.20180168
(School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018) (Key Laboratory of Complex System Modeling and Simulation (Hangzhou Dianzi University), Ministry of Education, Hangzhou 310018)
With the significant advances of deep learning in computer vision and natural language processing, existing methods can accurately understand the semantics of visual content and natural language, enabling research on cross-media data representation and interaction. In recent years, visual question answering (VQA) has become a research hot spot in the cross-media representation and interaction area. The goal of VQA is to learn a model that understands the visual content referred to by a natural language question and answers it automatically. This paper surveys recent research progress on VQA in terms of concepts, models, and datasets, and discusses the shortcomings of current work. Finally, possible future directions for VQA are discussed with respect to methodology, applications, and platforms.
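The VQA task described above is commonly realized as a joint-embedding model: an image encoder and a question encoder each produce a feature vector, the two are fused in a shared space, and a classifier predicts over a fixed answer vocabulary. The following is a minimal sketch of that fusion step only; all dimensions, weight matrices, and the random placeholder features are hypothetical stand-ins for real encoder outputs and trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions chosen for illustration only.
IMG_DIM, Q_DIM, JOINT_DIM, NUM_ANSWERS = 2048, 300, 512, 1000

# Placeholders: in a real system these would come from a CNN image
# encoder and a recurrent (or transformer) question encoder.
image_feat = rng.standard_normal(IMG_DIM)
question_feat = rng.standard_normal(Q_DIM)

# Untrained placeholder weights (a real model learns these end to end).
W_img = rng.standard_normal((JOINT_DIM, IMG_DIM)) * 0.01
W_q = rng.standard_normal((JOINT_DIM, Q_DIM)) * 0.01
W_ans = rng.standard_normal((NUM_ANSWERS, JOINT_DIM)) * 0.01

# Joint-embedding fusion: project both modalities into a shared space,
# combine them element-wise, then score a fixed answer set.
joint = np.tanh(W_img @ image_feat) * np.tanh(W_q @ question_feat)
logits = W_ans @ joint
answer_id = int(np.argmax(logits))  # index into the answer vocabulary
```

Element-wise multiplication is one of the simplest fusion choices; the survey's model discussion covers richer alternatives such as bilinear pooling and attention-based fusion.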