Review of Knowledge Base Question Answering Based on Information Retrieval

Tian Xuan; Wu Zhichao

doi:10.7544/issn1000-1239.202331013

Journal of Computer Research and Development > 2025 > 62(2): 314-335. > DOI: 10.7544/issn1000-1239.202331013 CSTR: 32373.14.issn1000-1239.202331013

Tian Xuan, Wu Zhichao. Review of Knowledge Base Question Answering Based on Information Retrieval[J]. Journal of Computer Research and Development, 2025, 62(2): 314-335. DOI: 10.7544/issn1000-1239.202331013

School of Information Science and Technology, Beijing Forestry University, Beijing 100083
Engineering Research Center for Forestry-oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083

Author Bio:
Tian Xuan: born in 1976. PhD, associate professor. Senior member of CCF. Her main research interests include intelligent information processing and text mining.

Wu Zhichao: born in 1999. Master's candidate. Student member of CCF. His main research interests include knowledge base question answering and large language models.
Received Date: December 13, 2023
Revised Date: June 18, 2024
Accepted Date: August 08, 2024
Available Online: August 13, 2024

Abstract

Knowledge base question answering (KBQA) aims to retrieve relevant information from a knowledge base for model inference and to return accurate answers. In recent years, with the development of deep learning and large language models, KBQA based on information retrieval has become a research focus, and many novel methods have emerged. We summarize and analyze information-retrieval-based KBQA methods from several aspects, including models and datasets. First, we introduce the research significance and related definitions of KBQA. Then, following the processing stages of the model, we explain the key problems and typical solutions in each of four stages (question parsing, information retrieval, model inference, and answer generation) and summarize the common network modules used in each stage. Next, we analyze the lack of interpretability of information-retrieval-based KBQA methods. In addition, we classify and summarize relevant datasets with different characteristics and the baseline models for each stage. Finally, we give a summary and outlook for each stage of information-retrieval-based KBQA, as well as for the overall development direction of the field.
Keywords:
- knowledge base question answering
- information retrieval
- deep learning
- large language models
- stage issues
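
The four stages named in the abstract (question parsing, information retrieval, model inference, answer generation) can be illustrated with a minimal sketch. It is not taken from the paper: the toy triples, the function names, and the word-overlap scoring are all assumptions made for illustration, and the overlap score merely stands in for the neural modules a real system would use at the inference stage.

```python
from dataclasses import dataclass

# Toy knowledge base of (subject, relation, object) triples.
# Invented for illustration; not data from the paper.
TRIPLES = [
    ("Beijing", "capital_of", "China"),
    ("Paris", "capital_of", "France"),
    ("China", "continent", "Asia"),
]


@dataclass
class ParsedQuestion:
    topic_entity: str
    tokens: set


def parse_question(question, triples):
    """Stage 1, question parsing: find a known entity mentioned in the question."""
    tokens = set(question.lower().replace("?", "").split())
    entities = {t[0] for t in triples} | {t[2] for t in triples}
    for entity in sorted(entities):
        if entity.lower() in tokens:
            return ParsedQuestion(entity, tokens)
    raise ValueError("no known entity found in question")


def retrieve_subgraph(parsed, triples):
    """Stage 2, information retrieval: keep triples that touch the topic entity."""
    return [t for t in triples if parsed.topic_entity in (t[0], t[2])]


def score_candidates(parsed, subgraph):
    """Stage 3, inference: score each neighbour of the topic entity by the
    word overlap between the question and the connecting relation name."""
    scores = {}
    for subj, rel, obj in subgraph:
        candidate = obj if subj == parsed.topic_entity else subj
        overlap = len(parsed.tokens & set(rel.split("_")))
        scores[candidate] = max(scores.get(candidate, 0), overlap)
    return scores


def generate_answer(scores):
    """Stage 4, answer generation: return the highest-scoring candidate."""
    return max(scores, key=scores.get)


def answer(question, triples=TRIPLES):
    """Run the four stages in order on a single question."""
    parsed = parse_question(question, triples)
    subgraph = retrieve_subgraph(parsed, triples)
    return generate_answer(score_candidates(parsed, subgraph))
```

Calling `answer("What is the capital of China?")` walks the pipeline end to end: the parser picks `China` as the topic entity, retrieval keeps the two triples that mention it, and the relation `capital_of` overlaps most with the question, so `Beijing` is returned. Keeping each stage as a separate function mirrors the stage-by-stage organization of the survey and makes it easy to swap one stage for a stronger component.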
