ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (11): 2404-2418. doi: 10.7544/issn1000-1239.2020.20190564

• System Architecture •

SBS: An Efficient R-Tree Query Algorithm Exploiting the Internal Parallelism of SSDs

Chen Yubiao1, Li Jianzhong1, Li Yingshu1,2

  1(College of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001); 2(College of Computer Science and Technology, Georgia State University, Atlanta, GA, USA 30303) (chenyubiao@hit.edu.cn)
  • Online: 2020-11-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (U1811461, 61832003, 61732003) and the U.S. National Science Foundation (1741277, 1829674).

Abstract: Flash-based solid-state drives (SSDs) are gradually replacing hard disk drives as the mainstream storage device. As SSD technology advances, more and more flash chips and hardware resources are integrated into each drive, giving SSDs rich internal parallelism. Traditional work on external-memory algorithms and data structures, however, rarely takes this internal parallelism into account. Range query is a basic operation of the R-tree, the index structure underlying many geographic information systems (GIS), so its efficiency is important to overall GIS performance. Because a child node can be loaded only after its parent has been read, the R-tree, like almost all tree index structures, has difficulty exploiting the internal parallelism of SSDs. To overcome this data-loading dependency, a stack-based range query algorithm, SBS (stack batch search), is proposed. It exploits the internal parallelism of SSDs effectively while requiring at most O(B log N) memory. Finally, the performance of SBS is verified through experiments on real data. The results show that, with acceptable memory consumption, SBS achieves range-query speedups of up to 3.4 and 4.5 on two different SSDs, respectively.

Key words: R-tree, range query, internal parallelism, solid-state drive, speedup
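
The abstract does not spell out the SBS procedure, so the following is only a minimal sketch of the general idea it names: a range query driven by an explicit stack, where sibling nodes that intersect the query are fetched as one batch instead of one at a time. Everything here is an assumption made for illustration: the in-memory `pages` dictionary standing in for SSD pages, the thread pool standing in for parallel I/O submission, and the names `build_pages`, `read_batch`, and `stack_batch_search`. The sketch does not attempt to reproduce the O(B log N) memory bound or the measured speedups reported for SBS.

```python
from concurrent.futures import ThreadPoolExecutor


def intersects(a, b):
    """Rectangles are (xmin, ymin, xmax, ymax); True if they overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects):
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def build_pages(points, fanout=4):
    """Pack a non-empty point list into a toy R-tree stored as {page_id: node}.

    Hypothetical layout: each node is {"leaf": bool, "entries": [(rect, payload)]},
    where payload is a point in a leaf and a child page id otherwise.
    """
    pages, level = {}, []
    pts = sorted(points)
    for i in range(0, len(pts), fanout):
        entries = [((x, y, x, y), (x, y)) for x, y in pts[i:i + fanout]]
        pid = len(pages)
        pages[pid] = {"leaf": True, "entries": entries}
        level.append((mbr([e[0] for e in entries]), pid))
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), fanout):
            entries = level[i:i + fanout]
            pid = len(pages)
            pages[pid] = {"leaf": False, "entries": entries}
            nxt.append((mbr([e[0] for e in entries]), pid))
        level = nxt
    return pages, level[0][1]


def read_batch(pages, page_ids, pool):
    """Stand-in for submitting several page reads to the SSD at once."""
    return list(pool.map(pages.__getitem__, page_ids))


def stack_batch_search(pages, root_id, query, workers=8):
    """Range query with an explicit stack of sibling batches.

    Returns (matching points, number of batch submissions, pages read).
    """
    results, batches, reads = [], 0, 0
    stack = [[root_id]]  # each frame is a list of page ids to read together
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while stack:
            frame = stack.pop()
            nodes = read_batch(pages, frame, pool)
            batches += 1
            reads += len(frame)
            for node in nodes:
                hits = [payload for rect, payload in node["entries"]
                        if intersects(rect, query)]
                if node["leaf"]:
                    results.extend(hits)   # payloads are points
                elif hits:
                    stack.append(hits)     # payloads are child page ids
    return results, batches, reads
```

In this sketch, a child batch can be issued only after its parent has been read, which is the loading dependency the abstract describes. Batching siblings means the number of I/O submissions can be smaller than the number of pages read, and that is where a device with internal parallelism could overlap the work.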

CLC Number: