Citation: Tu Yaofeng, Han Yinjun, Jin Hao, Chen Zhenghua, Chen Bing. UStore: Unified Storage System for Advanced Hardware[J]. Journal of Computer Research and Development, 2023, 60(3): 525-538. DOI: 10.7544/issn1000-1239.202220503
The explosive growth of data has made distributed storage widely used. Distributed storage systems have long relied directly on the local file system to access local storage resources. With the emergence of high-performance NVMe SSDs, persistent memory (PMEM), and heterogeneous acceleration devices, the local file system struggles to exploit the features and performance advantages of the new hardware. Many existing studies optimize performance at the software level for the hardware characteristics of SSD or PMEM. However, these approaches have poor compatibility and scalability, cannot adapt flexibly to changes in the hardware environment, and lack a unified solution for new kinds of hardware. This paper proposes UStore, a unified storage system compatible with multiple storage media. UStore can flexibly select storage media according to the business scenario, and its design is optimized for combinations of typical hardware such as PMEM, KVS accelerator cards, and NVMe SSDs, making full use of their hardware characteristics to meet diverse needs. Through a metadata design decoupled from the physical storage medium, UStore adapts to the performance and atomic-update capability of different hardware and realizes a flexible metadata management strategy. Through an efficient data management mechanism and update strategy, it guarantees atomic data writes without a log, eliminating the write amplification and performance jitter of existing systems. Experimental results show that, compared with BlueStore, UStore improves 4KB random read performance by 3.2× and 4KB random write performance by 8.2×. Under three typical hardware combinations, UStore exhibits data access characteristics that match the underlying hardware, giving full play to the characteristics and performance of the storage devices.