Citation: Hao Dongdong, Gao Congming, Shu Jiwu. Design of User Space Storage Architecture for SCSI Subsystem[J]. Journal of Computer Research and Development, 2025, 62(3): 633-647. DOI: 10.7544/issn1000-1239.202440632
Over the past few years, the storage industry has undergone tremendous changes. Semiconductor storage devices such as solid-state drives (SSDs) have flourished and now comprehensively outperform traditional hard disk drives (HDDs), which locate data by moving a magnetic head. Today, the mainstream protocols supporting SSDs are NVMe and SAS. NVMe is a high-performance storage protocol designed specifically for SSDs to maximize their performance, whereas SAS is designed around the requirements of data centers, providing high reliability and high scalability while balancing system performance against cost. As storage media become faster, the time overhead of a software stack originally designed for slow storage devices accounts for an increasingly significant share of each I/O. To address this issue, academia and industry have proposed numerous solutions. For example, Intel's SPDK (Storage Performance Development Kit) greatly shortens the response time of NVMe SSDs to applications by implementing device drivers in user space and polling for I/O completion, which substantially improves the performance of the entire system. However, previous research on optimizing the storage software stack for SAS SSDs is very limited. This work therefore implements a SAS software stack optimized for SSDs in user space. Experimental results show that this optimization effectively improves the efficiency of data access between applications and storage devices. In addition, to accurately evaluate the time cost of storage devices in the I/O stack, a hardware performance testing tool, HwPerfIO, is proposed; it eliminates most software overhead and thus measures storage device performance more accurately.