Liao Xiaojian, Yang Zhe, Yang Hongzhang, Tu Yaofeng, Shu Jiwu. A Low-Latency Storage Engine with Low CPU Overhead[J]. Journal of Computer Research and Development, 2022, 59(3): 489-498. DOI: 10.7544/issn1000-1239.20210574
1(Department of Computer Science and Technology, Tsinghua University, Beijing 100084)
2(ZTE Corporation, Nanjing 210012)
Funds: This work was supported by the National Key Research and Development Program of China (2018YFB1003301), the National Natural Science Foundation of China (61832011), and the Project of ZTE (20182002008).
The latency of solid-state drives (SSDs) has improved dramatically in recent years. For example, an ultra-low-latency SSD can process 4 KB of data in 10 microseconds. With such low latency, how to reap I/O completions efficiently becomes an important issue in modern storage systems. Traditional storage systems reap I/O completions through hardware interrupts, which introduce extra context-switch overhead and further prolong the overall I/O latency. Existing work uses polling as an alternative to hardware interrupts, thereby eliminating context switches, but at the cost of high CPU consumption. This paper proposes a CPU-efficient and low-latency storage engine, named NIO, to take full advantage of ultra-low-latency SSDs. The key idea of NIO is to separate the I/O path of short I/Os from that of long I/Os. NIO uses classic hardware interrupts for long I/Os, as polling long I/Os does not bring significant improvement but incurs huge CPU overhead; for short I/Os, NIO introduces lazy polling, which lets the I/O thread sleep for a variable time interval before it starts polling continuously, thereby achieving low latency with low CPU consumption. NIO further introduces a transaction-aware I/O reaping mechanism to reduce transaction latency, and a dynamic adjustment mechanism to cope with dynamic changes in the workload and in the internal activities of the device. Under dynamic workloads, NIO shows performance comparable to a polling-based storage engine while reducing CPU consumption by at least 59%.
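The following is a minimal, self-contained sketch of the lazy-polling idea described in the abstract, not the actual NIO implementation; the latency and sleep-interval constants, the simulated device thread, and all identifiers are assumptions for illustration. The reaping thread sleeps for most of the expected device latency of a short I/O and busy-polls only the short remaining tail, so CPU cycles are spent only near the completion time instead of throughout the whole I/O.

/*
 * Hypothetical sketch of lazy polling (not the authors' code).
 * Compile with: cc -O2 -pthread lazy_poll.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

static atomic_bool io_done = false;

/* Assumed latency of a short (e.g., 4 KB) I/O, per the abstract: ~10 us. */
#define DEVICE_LATENCY_NS 10000
/* Assumed lazy-sleep interval: cover most of the expected latency. */
#define SLEEP_INTERVAL_NS  8000

/* Emulate the SSD finishing the I/O and posting a completion. */
static void *device_thread(void *arg)
{
    (void)arg;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = DEVICE_LATENCY_NS };
    nanosleep(&ts, NULL);
    atomic_store(&io_done, true);
    return NULL;
}

int main(void)
{
    pthread_t dev;
    pthread_create(&dev, NULL, device_thread, NULL);

    /* Lazy polling: first yield the CPU for most of the expected latency... */
    struct timespec ts = { .tv_sec = 0, .tv_nsec = SLEEP_INTERVAL_NS };
    nanosleep(&ts, NULL);

    /* ...then busy-poll only for the short remaining tail. */
    unsigned long spins = 0;
    while (!atomic_load(&io_done))
        spins++;

    printf("I/O reaped after %lu polling iterations\n", spins);
    pthread_join(dev, NULL);
    return 0;
}

In the paper's design the sleep interval is not a fixed constant as above but is adjusted dynamically to track changes in the workload and in the device's internal activities; this sketch only illustrates the sleep-then-poll structure.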