基于学习索引的图式区块链高效可验证查询机制

常健; 林立成; 李彬弘; 肖江; 金海

doi:10.7544/issn1000-1239.202330272

摘要: 区块链技术近年来受到了广泛关注，并应用于各个领域，数据查询是其在应用过程的一个重要技术，如物流链中的数据溯源等. 随着区块链系统中交易数据量的持续增长，支持高并发事务处理的图式区块链成为区块链技术的研究热点. 图式区块链的高并发区块使得数据查询难以像传统链式结构依次遍历，可以根据图式结构采用广度优先或深度优先遍历策略，但这种查询方式存在效率低、验证难等问题. 针对图式区块链数据查询的效率和可验证性问题，提出了一种基于学习索引的高效可验证的图式区块链查询机制Lever. 该机制通过引入学习索引技术对图式区块链中时序数据分布特征进行学习以实现对索引过程的优化，旨在提高图式区块链查询的效率和可验证性. 学习索引是通过学习数据分布来减少索引存储空间和查询时间的新型索引技术，将学习索引应用于图式区块链的纪元高度与时间戳的映射关系中，通过函数运算的方式定位查询数据，提高查询速度和效率. 同时，为了加快纪元内多个区块数据的过滤速度，在每个区块头部添加布隆过滤器，并为每个纪元生成一个聚合布隆过滤器，从而提高纪元内的数据遍历速度. 此外，为保证查询结果的正确性和完整性，该机制结合布隆过滤器和排序默克尔树生成可验证对象，通过部分默克尔树分支实现对布隆过滤器假阳性的不存在证明，有效减小验证对象的规模，从而提高图式区块链查询过程的数据传输效率. 实验结果表明，Lever能有效提高基于DAG的图式区块链查询效率和可验证性，与Conflux的基本查询机制相比，该机制的查询性能最高提升了10倍，可验证对象大小开销可以降低90%.

Abstract: Blockchain system has been broadly applied and studied in the last few years with query being essential to facilitate a wide range of blockchain application scenarios such as data provenance in the logistics chain. As blockchain data continues to grow, the efficiency and verifiability of data query have become hot issues in blockchain research. The high concurrent blocks of the DAG-based blockchain make it difficult for queries to traverse sequentially like the traditional chain structure. Breadth-first or depth-first query strategies can be adopted according to the DAG structure, but this method has the problems of low efficiency and poor verification. We propose an efficient and verifiable query mechanism, called Lever, for DAG-based blockchains based on the learned index. The mechanism aims to improve the efficiency and verifiability of blockchain query by introducing learned index technology with the data distribution characteristic to optimize the indexing process in the blockchain. Learned index is a novel indexing technology that reduces the storage space and query time of indexes by learning the distribution of data. In this paper, learned index is applied to the mapping between epoch height and timestamp of the DAG-based blockchain, greatly improving query speed and efficiency. At the same time, in order to speed up the data filtering of multiple blocks in an epoch, a Bloom filter is added to the block header, and an aggregated Bloom filter is produced by miner for each epoch. In addition, to guarantee the authentication (i.e., correctness and completeness) of query results, the mechanism introduces a sorted Merkle tree to perform proof with the Merkle tree branch by combining the filtering and sorting processes, effectively reducing the size of the verification object (VO), thereby enhancing the verifiability of blockchain query. Experimental results show that Lever can effectively improve the efficiency and verifiability of DAG-based blockchain query, and Lever improves query performance by up to 10 times and reduces VO size overhead by 90% compared with the basic query mechanism of Conflux.

基于学习索引的图式区块链高效可验证查询机制

Efficient and Verifiable Query Mechanism of DAG Blockchain Based on Learned Index