ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (3): 649-659. doi: 10.7544/issn1000-1239.2020.20180799


Optimization of the Key-Value Storage System Based on Fused User-Level I/O

An Zhongqi1, Zhang Yunyao1,2, Xing Jing1, Huo Zhigang1,2   

  1. State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; 2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049
  • Online: 2020-03-01
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2018YFC0809300) and the National Natural Science Foundation of China for Young Scientists (61502454).

Abstract: Traditional distributed key-value storage systems are commonly designed around the conventional socket and POSIX I/O interfaces. Limited by the interface semantics and OS kernel overhead, such systems struggle to run efficiently on modern high-performance network and storage hardware. In this paper, we propose a fused user-level I/O approach that improves the throughput and latency consistency of key-value systems built on high-speed Ethernet and NVMe SSDs. The control plane of the proposed I/O stack uses a single processor core and a single context to cooperatively manage the hardware queues of both the NIC and the SSD devices, eliminating the overhead of kernel-mode entry, interrupts, context switches, and inter-core communication. The data plane is driven by a unified memory pool for fused I/O access, and data is transferred directly between the key-value system and the device hardware without extra copies. For requests with large payloads, data is sliced and fed into different DMA stages, and latency is further hidden through pipelining and overlapping. We present UKV, an all-in-userland key-value system that supports a two-level DRAM-SSD storage hierarchy and the widely used Memcache interface. Experimental results show that, compared with Fatcache, the QPS of SSD-involved SET requests is increased by 14.97%~97.78%, and the QPS of GET operations is increased by 14.60%~51.81%. The p95 latency of SSD-involved SET requests is reduced by 26.12%~40.90%, and the p95 latency of GET operations is reduced by 15.10%~24.36%.
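
The slicing and pipelining idea for large payloads can be illustrated with a small timing model. This is only a sketch under stated assumptions, not UKV's implementation: the function names, the two-stage split (NIC DMA, then SSD DMA), and the time units are all hypothetical and chosen for illustration. It compares handling each slice end to end against overlapping the stages of consecutive slices.

```python
def slice_payload(size, slice_size):
    """Split a payload of `size` bytes into slice lengths of at most `slice_size`."""
    return [min(slice_size, size - off) for off in range(0, size, slice_size)]


def serial_latency(slice_times):
    """Total time when each slice finishes every stage before the next slice starts.

    `slice_times` is a list of per-slice tuples, one duration per stage.
    """
    return sum(sum(stages) for stages in slice_times)


def pipelined_latency(slice_times):
    """Total time when stages of consecutive slices overlap.

    A stage of a slice starts once (a) the same slice has left the previous
    stage and (b) the previous slice has left this stage.
    """
    if not slice_times:
        return 0
    n_stages = len(slice_times[0])
    finish = [0] * n_stages  # finish time of the most recent slice at each stage
    for stages in slice_times:
        prev_stage_done = 0
        for s, duration in enumerate(stages):
            start = max(prev_stage_done, finish[s])
            finish[s] = start + duration
            prev_stage_done = finish[s]
    return finish[-1]


# Hypothetical example: a 10000-byte value cut into 4096-byte slices.
slices = slice_payload(10000, 4096)

# Hypothetical costs: four slices, each taking 10 time units in the NIC DMA
# stage and 10 time units in the SSD DMA stage.
times = [(10, 10)] * 4
print(serial_latency(times), pipelined_latency(times))
```

In this toy model the second stage of one slice runs while the first stage of the next slice is in flight, so the total approaches the cost of the slowest stage times the number of slices plus a one-time fill cost. Real gains depend on the device characteristics and are not predicted by this sketch.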

Key words: key-value storage system, kernel-bypass, user-space fused I/O, high-speed Ethernet, NVMe SSD
