数据库索引调优技术综述

赖思超; 吴小莹; 彭煜玮; 彭智勇

doi:10.7544/issn1000-1239.202220931

摘要: 索引调优是数据库调优的重要组成部分，一直受到广泛关注. 由于索引调优问题的理论复杂性和大数据时代的到来，通过DBA手动调优的方案已经无法满足现代数据库的发展需求，调优方案逐渐开始向自动化、智能化的方向发展. 随着机器学习技术的发展，越来越多的索引选择方案开始引入机器学习技术，并取得了一定的研究成果. 将索引调优问题的解决方案归结为一种基于搜索的调优范式，归纳了其研究内容，阐述了其面临的挑战，对调优范式内的索引配置空间的生成、索引配置的评价以及索引配置的枚举与搜索3方面的研究成果进行了归纳、总结和对比. 对动态工作负载下的索引选择问题（index selection problem，ISP）所面临的新挑战进行了分析，并基于在线反馈控制回路框架对其解决方案进行梳理. 讨论了索引调优工具的发展与现状，通过对现有研究的分析论述，为后来研究者提供参考和研究思路，并对索引选择方案的未来进行了展望.

Abstract: Index tuning is an important problem in database performance tuning and has been studied consistently by worldwide researchers. Due to the theoretical complexity of index tuning as well as the advent of the big data era, manual tuning by DBA is no longer feasible for modern database systems, hence automated and intelligent solutions have been developed. With the development of machine learning techniques, more and more index tuning solutions have integrated with machine learning techniques for better performance and significant progress has been made recently. In this survey, we formulate the problem of index tuning under a search-based paradigm, and under this context, we analyze the main tasks and challenges of this problem. We categorize relevant studies into three main components of the search-based paradigm, namely the generation of the index configurations’ search space, the evaluation of specific index configurations, and the enumeration or the search of index configurations. Then we discuss and compare the related work in each category. We further identify and analyze new challenges for the online index tuning problem where the workload is ad hoc, dynamic, and shifting. We summarize the existing solutions under the online feedback control loop framework. Finally, we discuss the state-of-the-art index tuning tools. Hopefully, with the thorough discussion and evaluation of current research, this survey can provide insights to interested researchers and conclude with future research directions for index tuning.

数据库索引调优技术综述

Survey on Database Index Tuning Techniques