Optimization of LSM-Tree for Key-Value Stores
-
Graphical Abstract
-
Abstract
LSM-Tree (log-structured merge tree) is widely used in contemporary mainstream key-value storage systems to process the vast amount of data. LSM-Tree converts random write requests into sequential write requests by maintaining a batch of requests in memory to achieve high write efficiency. However, there are still two shortages in LSM-Tree: First, the flow direction of data is unidirectional and fixed. Data stored at the bottom of LSM-Tree will remain there until they are deleted by compaction operation. This will make read amplification problem more serious. Second, the data distribution in LSM-Tree does not reflect access frequency. The data with different access frequencies could be stored in the same physical location. The data with high access frequency in lower layer will cost higher access latency. This paper presents FloatKV (floating key-value), an access frequency aware key-value storage strategy. FloatKV first proposes a structure called LRFO (LRU and FIFO) to manage data stored in memory. Then FloatKV presents a new strategy for floating process in external storage to move data closer to memory. FloatKV records the access frequency of data stored in the external storage and adjusts the storage location based on its access frequency. To verify the feasibility and performance of FloatKV, we conduct a series of experiments using standard benchmarks from YCSB (yahoo! cloud serving benchmark) and compare FloatKV with representative key-value store techniques. The experimental results show that FloatKV can significantly improve the reading efficiency and effectively reduce the read amplification problem.
-
-