Citation: | Liu Yang, Jin Peiquan. ZB+-tree: A Novel ZNS SSD-Aware Index Structure[J]. Journal of Computer Research and Development, 2023, 60(3): 509-524. DOI: 10.7544/issn1000-1239.202220502 |
ZNS SSD is a newly emerging solid state drive (SSD). It supports zone-based storage and access of data in SSD. Compared with traditional SSDs, ZNS SSD can effectively improve the read-write throughput of SSD, reduce write amplification and over-provisioning in SSD. However, ZNS SSD requires that data must be written to zones sequentially, and tasks such as space allocation and garbage collection need to be controlled by users. These characteristics of ZNS SSD pose new challenges to many components in traditional database systems, such as storage management, indexing, and buffer management. This paper mainly studies how to adapt the traditional B+-tree index structure to ZNS SSD. We propose a new ZNS SSD-aware index structure called ZB+-tree (ZNS-aware B+-tree), which is the first ZNS SSD-aware index so far. ZB+-tree takes B+-tree as the base structure, and utilizes the two kinds of zones inside ZNS SSD, namely conventional zones supporting a few random writes and sequential zones only supporting sequential writes. In particular, it uses conventional zones to absorb random writes to ZNS SSD and stores the index nodes in conventional and sequential zones separately. We design different node structures for the nodes in the two zones. By using different types of nodes on the two zones, ZB+-tree can absorb random writes on the index and ensure the sequential write requirements on sequential zones. We conduct experiments by simulating ZNS SSD using null_blk and libzbd. Also, we modify the existing copy-on-write (CoW) B+-tree as the competitor. The results show that ZB+-tree outperforms CoW B+-tree in various metrics including running time and space efficiency.
[1] |
Bjørling M, Aghayev A, Holmberg H, et al. ZNS: Avoiding the block interface tax for flash-based SSDs [C] //Proc of 2021 USENIX Annual Technical Conf (USENIX ATC 21). Berkeley, CA: USENIX Association, 2021: 689−703
|
[2] |
Han K, Gwak H, Shin D, et al. ZNS+: Advanced zoned namespace interface for supporting in-storage zone compaction [C] //Proc of the 15th USENIX Symp on Operating Systems Design and Implementation (OSDI 21). Berkeley, CA: USENIX Association, 2021: 147−162
|
[3] |
Stavrinos T, Berger D S, Katz-Bassett E, et al. Don't be a blockhead: Zoned namespaces make work on conventional SSDs obsolete[C] //Proc of the Workshop on Hot Topics in Operating Systems. New York: ACM, 2021: 144−151
|
[4] |
Maheshwari U. From blocks to rocks: A natural extension of zoned namespaces[C] //Proc of the 13th ACM Workshop on Hot Topics in Storage and File Systems. New York: ACM, 2021: 21−27
|
[5] |
Choi G, Lee K, Oh M, et al. A new LSM-style garbage collection scheme for ZNS SSDs [C] //Proc of the 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 20). Berkeley, CA: USENIX Association, 2020: Article 1
|
[6] |
Purandare D R, Wilcox P, Litz H, et al. Append is near: Log-based data management on ZNS SSDs[C/OL] //Proc of the 12th Annual Conf on Innovative Data Systems Research (CIDR’22). 2022[2022-08-20].https://www.ssrc.ucsc.edu/media/pubs/8698b15f3152427d1285a995af615fbe7be26c7b.pdf
|
[7] |
Shin H, Oh M, Choi G, et al. Exploring performance characteristics of ZNS SSDs: Observation and implication[C/OL] //Proc of the 9th Non-Volatile Memory Systems and Applications Symp (NVMSA). Piscataway, NJ: IEEE, 2020[2022-08-20].https://ieeexplore.ieee.org/abstract/document/9188086
|
[8] |
Park C, Cheon W, Kang J, et al. A reconfigurable FTL (flash translation layer) architecture for NAND flash-based applications[J/OL]. ACM Transactions on Embedded Computing Systems, 2008[2022-08-20].https://people.eecs.berkeley.edu/~kubitron/courses/cs262a/handouts/papers/a38-park.pdf
|
[9] |
Bux W, Iliadis I. Performance of greedy garbage collection in flash-based solid-state drives[J]. Performance Evaluation, 2010, 67(11): 1172−1186 doi: 10.1016/j.peva.2010.07.003
|
[10] |
Smith K. Understanding SSD over-provisioning [EB/OL]. 2022[2022-08-20].https://www.edn.com/understanding-ssd-over-provisioning
|
[11] |
Hu Xiaoyu, Eleftheriou E, Haas R, et al. Write amplification analysis in flash-based solid state drives[C/OL] //Proc of 2009 Israeli Experimental Systems Conf (SYSTOR). New York: ACM , 2009[2022-08-20]. http://roman.pletka.ch/publications/hu2009write-ampl.pdf
|
[12] |
Bjørling M. ZNS SDDs in Western Digital [EB/OL]. 2020[2022-08-20].https://snia.org/sites/default/files/SDC/2020/075-Bjørling-Zoned-Namespaces-ZNS-SSDs.pdf
|
[13] |
Western Digital. NVMe Zoned Namespaces (ZNS) SSDs [EB/OL]. 2021[2022-08-20].https://zonedstorage.io/docs/introduction/zns
|
[14] |
NVMe. Zoned Namespaces Command Set [EB/OL]. 2022[2022-08-20].https://nvmexpress.org/wp-content/uploads/NVM-Express-Zoned-Namespace-Command-Set-Specification-1.1b-2022.01.05-Ratified.pdf
|
[15] |
Roh H, Park S, Kim S, et al. B+-tree index optimization by exploiting internal parallelism of flash-based solid state drives[J]. Proceedings of the VLDB Endowment, 2011, 5(4): 286−297 doi: 10.14778/2095686.2095688
|
[16] |
Na G J, Lee S W, Moon B. Dynamic in-page logging for B+-tree index[J]. IEEE Transactions on Knowledge and Data Engineering, 2011, 24(7): 1231−1243
|
[17] |
Agrawal D, Ganesan D, Sitaraman R, et al. Lazy-adaptive tree: An optimized index structure for flash devices[J]. Proceedings of the VLDB Endowment, 2009, 2(1): 361−372 doi: 10.14778/1687627.1687669
|
[18] |
Ahn J S, Kang D, Jung D, et al. μ*-tree: An ordered index structure for NAND flash memory with adaptive page layout scheme[J]. IEEE Transactions on Computers, 2012, 62(4): 784−797
|
[19] |
Fang Huawei, Yeh M, Suei P, et al. An adaptive endurance-aware B+-tree for flash memory storage systems[J]. IEEE Transactions on Computers, 2013, 63(11): 2661−2673
|
[20] |
Jin Peiquan, Yang Chengcheng, Jensen C, et al. Read/write-optimized tree indexing for solid-state drives[J]. The VLDB Journal, 2016, 25(5): 695−717 doi: 10.1007/s00778-015-0406-1
|
[21] |
Lv Yanfei, Li Jing, Cui Bin, et al. Log-compact R-tree: An efficient spatial index for SSD[C]//Proc of the Int Conf on Database Systems for Advanced Applications (DASFAA 11). Berlin: Springer, 2011: 202−213
|
[22] |
Jin Peiquan, Yang Chengcheng, Wang Xiaoliang, et al. SAL-Hashing: A self-adaptive linear hashing index for SSDs[J]. IEEE Transactions on Knowledge and Data Engineering, 2020, 32(3): 519−532 doi: 10.1109/TKDE.2018.2884714
|
[23] |
Ho V, Park D. WPCB-tree: A novel flash-aware B-tree index using a write pattern converter [J/OL]. Symmetry, 2018[2022-08-20].https://www.mdpi.com/2073 − 8994/10/1/18/pdf
|
[24] |
Jin R, Cho H, Lee S, et al. Lazy-split B+-tree: A novel B+-tree index scheme for flash-based database systems[J]. Design Automation for Embedded Systems, 2013, 17(1): 167−191 doi: 10.1007/s10617-013-9123-4
|
[25] |
Twigg A, Byde A, Miłoś G, et al. Stratified B-trees and versioned dictionaries[C] //Proc of the 3rd Workshop on Hot Topics in Storage and File Systems (HotStorage 11). Berkeley, CA: USENIX Association, 2011: Article 3
|
[26] |
闫玮,张兴军,纪泽宇,等. 基于持久性内存的单向移动B+树[J]. 计算机研究与发展,2021,58(2):371−383 doi: 10.7544/issn1000-1239.2021.20200403
Yan Wei, Zhang Xingjun, Ji Zeyu, et al. One-direction shift B+-tree based on persistent memory[J]. Journal of Computer Research and Development, 2021, 58(2): 371−383 (in Chinese) doi: 10.7544/issn1000-1239.2021.20200403
|
[27] |
Western Digital. null_blk [EB/OL]. 2022[2022-10-15].https://zonedstorage.io/docs/getting-started/zns-device
|
[28] |
Western Digital. libzbd [EB/OL]. 2022[2022-08-20].https://github.com/westerndigitalcorporation/libzbd
|
[29] |
Cooper B F, Silberstein A, Tam E, et al. Benchmarking cloud serving systems with YCSB[C] //Proc of the 1st ACM Symp on Cloud Computing (SoCC 10). New York: ACM, 2010: 143−154
|
[1] | Wu Jingya, Lu Wenyan, Yan Guihai, Li Xiaowei. HyperTree: High Concurrent B+tree Index Accelerator[J]. Journal of Computer Research and Development, 2023, 60(7): 1661-1677. DOI: 10.7544/issn1000-1239.202111055 |
[2] | Yang Yongpeng, Jiang Dejun. A Method for Solving the wandering B+ tree Problem[J]. Journal of Computer Research and Development, 2023, 60(3): 539-554. DOI: 10.7544/issn1000-1239.202220555 |
[3] | Yan Wei, Zhang Xingjun, Ji Zeyu, Dong Xiaoshe, Ji Chenzhao. One-Direction Shift B+-Tree Based on Persistent Memory[J]. Journal of Computer Research and Development, 2021, 58(2): 371-383. DOI: 10.7544/issn1000-1239.2021.20200403 |
[4] | Te Rigen, Li Wei, and Li Xiongfei. Storage Model and Implementation of the Dynamic Ordered Tree[J]. Journal of Computer Research and Development, 2013, 50(5): 969-985. |
[5] | Shen Yan, Song Shunlin, Zhu Yuquan. Mining Algorithm of Association Rules Based on Disk Table Resident FP-TREE[J]. Journal of Computer Research and Development, 2012, 49(6): 1313-1322. |
[6] | Wang Hongqiang, Li Jianzhong, and Wang Hongzhi. Processing XPath over F&B-Index[J]. Journal of Computer Research and Development, 2010, 47(5): 866-877. |
[7] | Zhou Da, Liang Zhichao, Meng Xiaofeng. HF-Tree: An Update-Efficient Index for Flash Memory[J]. Journal of Computer Research and Development, 2010, 47(5): 832-840. |
[8] | Sun Xiaojuan, Sun Ninghui, Chen Mingyu. Optimization of B-NIDS for Multicore[J]. Journal of Computer Research and Development, 2007, 44(10): 1733-1740. |
[9] | Ju Dapeng, Li Ming, Hu Jinfeng, Wang Dongsheng, Zheng Weimin, and Ma Yongquan. An Algorithm of B\++ Tree Management in P2P Environment[J]. Journal of Computer Research and Development, 2005, 42(8): 1438-1444. |
[10] | Dong Daoguo, Liang Liuhong, and Xue Xiangyang. VAR-Tree—A New High-Dimensional Data Index Structure[J]. Journal of Computer Research and Development, 2005, 42(1): 10-17. |
1. |
LUO Haoran,HU Shuisong,WANG Wenyong,TANG Yuke,ZHOU Junwei. Research on Multi-Core Processor Analysis for WCET Estimation. ZTE Communications. 2024(01): 87-94 .
![]() |