Abstract:
Information systems typically rely on data management systems, and SQL has become the mainstream query language for them because of its ease of use and flexibility: users write SQL statements, submit them to the data management system, and receive query results. The efficiency of the execution model determines how quickly the system can respond to user queries. Existing execution models mainly adopt either interpreted execution or compiled execution. Interpreted execution is used by most systems because of its scalability and maintainability. Compiled execution, by contrast, generates efficient custom code to speed up queries that would otherwise be processed by interpretation, and its significant performance gains have led a number of database systems to adopt the technique. However, generating custom code for a query is a complex process that requires many considerations, and in some cases compiled execution may even perform worse than the traditional Volcano model. We provide a systematic review of the progress of compiled execution techniques from conceptual and technical perspectives. First, we outline the basic concepts of query compilation and introduce the relevant terminology and background knowledge. Second, we present the relevant techniques from three perspectives: intermediate code generation, intermediate representation, and machine code generation and execution. Finally, we discuss future directions for compiled execution technology in light of current research trends in data management systems and recent research work.