Survey on Deep Learning Based Natural Language Interface to Database

潘璇; 徐思涵; 蔡祥睿; 温延龙; 袁晓洁

doi:10.7544/issn1000-1239.2021.20200209

Abstract

Abstract: A natural language interface to database (NLIDB) lets users query a database by describing what they want in natural language, sparing them the burden of learning SQL (structured query language), and is an important tool for barrier-free interaction with databases. Because of its high application value, NLIDB has attracted sustained attention from both academia and industry in recent years. Most mature NLIDB systems are based on classical natural language processing methods, which rely on hand-specified rules to translate natural language questions into structured queries. However, rule-based methods extend poorly beyond the cases their rules were written for. Deep learning models offer distributed and deep, abstract representations, and can mine the latent semantic features of natural language. Introducing deep learning into NLIDB has therefore become a popular research direction. This paper reviews recent progress in deep learning-based NLIDB: first, it groups existing models into 4 categories according to their decoding method and analyzes the strengths and weaknesses of each; second, it summarizes 7 auxiliary techniques commonly used in these models; finally, based on the problems that remain open, it proposes directions for future research.
