Abstract:
Persistent memory and RDMA (remote direct memory access) offer storage systems high bandwidth and low latency, creating new opportunities for designing high-performance distributed storage systems. However, their characteristics also raise challenges for data consistency management. On the one hand, consistently updating data in persistent memory requires explicitly executing hardware instructions that flush data out of the CPU cache, and these instructions incur very high overhead and can seriously degrade CPU performance. On the other hand, RDMA reads and writes remote memory directly, without involving the remote CPU. The server CPU is therefore unaware of remote write events and does not flush the written data, so a system failure can leave the data in an inconsistent state. To address these two problems, this paper proposes CCM, a consistency mechanism for distributed persistent memory file systems. First, we design and implement a consistency strategy based on a persistent operation log, which maintains system consistency by writing operation information to the log and persisting it. Second, we design a client-to-server consistency strategy that enables the remote CPU to actively flush data once the data transfer has completed. Finally, we implement asynchronous data flushing on the server side to improve system performance. Our experimental results show that the write bandwidth reaches 88% of the network’s raw bandwidth. Compared with Octopus, the state-of-the-art distributed file system, CCM shows a performance reduction of less than 1%.
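
The three mechanisms summarized in the abstract can be pictured with a small simulation. The sketch below is illustrative only and is not CCM's implementation: the names `OperationLog`, `Server`, `rdma_write`, `notify_write_done`, and `flush_pending` are hypothetical, a `bytearray` stands in for persistent memory, a dictionary stands in for the CPU cache, and `persist` merely marks where a real system would issue cache-line flush and fence instructions. The record layout (length, CRC32, JSON payload) is likewise an assumption.

```python
import json
import struct
import zlib
from collections import deque


class OperationLog:
    """Toy append-only operation log (hypothetical layout, not CCM's)."""

    HEADER = struct.Struct("<II")  # payload length, CRC32 of payload

    def __init__(self):
        self.pm = bytearray()  # simulated persistent-memory region
        self.persisted = 0     # number of bytes assumed durable

    def append(self, op, path, offset, length):
        payload = json.dumps([op, path, offset, length]).encode()
        self.pm += self.HEADER.pack(len(payload), zlib.crc32(payload)) + payload

    def persist(self):
        # Stand-in for cache-line flush + fence on real hardware.
        self.persisted = len(self.pm)

    def recover(self):
        """Return the records found in the persisted prefix of the log."""
        data, pos, records = bytes(self.pm[:self.persisted]), 0, []
        while pos + self.HEADER.size <= len(data):
            size, crc = self.HEADER.unpack_from(data, pos)
            start = pos + self.HEADER.size
            payload = data[start:start + size]
            if len(payload) != size or zlib.crc32(payload) != crc:
                break  # torn or incomplete record: stop replay here
            records.append(tuple(json.loads(payload)))
            pos = start + size
        return records


class Server:
    """Simulated server whose CPU does not observe one-sided writes."""

    def __init__(self):
        self.log = OperationLog()
        self.cache = {}         # written but not yet flushed data
        self.durable = {}       # data assumed to be in persistent memory
        self.pending = deque()  # flush work deferred to a background step

    def rdma_write(self, path, offset, data):
        # One-sided write: data lands without any server-side code running.
        self.cache[(path, offset)] = data

    def notify_write_done(self, path, offset, length):
        # Client-to-server step: the client reports that the transfer is
        # complete; the server logs the operation, persists the log entry,
        # and queues the data flush instead of performing it inline.
        self.log.append("write", path, offset, length)
        self.log.persist()
        self.pending.append((path, offset))

    def flush_pending(self):
        # Asynchronous flushing: drain the queue off the critical path.
        while self.pending:
            key = self.pending.popleft()
            self.durable[key] = self.cache.pop(key)
```

In this sketch, a client would call `rdma_write` followed by `notify_write_done`, and the server would later run `flush_pending` from a background thread. If a crash occurred between the two, `recover` would still return the logged operation, which is what would let a recovery procedure detect that a write was in flight.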