ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (11): 2432-2441.doi: 10.7544/issn1000-1239.2020.20190551

Previous Articles     Next Articles

Optimization of LSM-Tree for Key-Value Stores

Wu Shangyu, Xie Jingwen, Wang Yi   

  1. (College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060)
  • Online:2020-11-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61972259), the Guangdong Natural Science Foundation-Distinguished Young Scholars (2019B151502055), the Guangdong Natural Science Foundation (2017B030314073), and the Shenzhen Natural Science Foundation (JCYJ20170817100300603).

Abstract: LSM-Tree (log-structured merge tree) is widely used in contemporary mainstream key-value storage systems to process the vast amount of data. LSM-Tree converts random write requests into sequential write requests by maintaining a batch of requests in memory to achieve high write efficiency. However, there are still two shortages in LSM-Tree: First, the flow direction of data is unidirectional and fixed. Data stored at the bottom of LSM-Tree will remain there until they are deleted by compaction operation. This will make read amplification problem more serious. Second, the data distribution in LSM-Tree does not reflect access frequency. The data with different access frequencies could be stored in the same physical location. The data with high access frequency in lower layer will cost higher access latency. This paper presents FloatKV (floating key-value), an access frequency aware key-value storage strategy. FloatKV first proposes a structure called LRFO (LRU and FIFO) to manage data stored in memory. Then FloatKV presents a new strategy for floating process in external storage to move data closer to memory. FloatKV records the access frequency of data stored in the external storage and adjusts the storage location based on its access frequency. To verify the feasibility and performance of FloatKV, we conduct a series of experiments using standard benchmarks from YCSB (yahoo! cloud serving benchmark) and compare FloatKV with representative key-value store techniques. The experimental results show that FloatKV can significantly improve the reading efficiency and effectively reduce the read amplification problem.

Key words: computer architecture, key-value storage, LSM-Tree(log-structured merge tree), access frequency, data floating

CLC Number: