An Efficient Lock Protocol in Remote Direct Memory Access Network

Gao Jian; Shu Jiwu

doi:10.7544/issn1000-1239.202550361

Abstract

Distributed locks are a crucial component of distributed storage systems, and the performance of the lock protocol has a critical impact on overall system performance. Remote direct memory access (RDMA) is an emerging data center networking technology that supports one-sided communication verbs and offers low CPU overhead, low latency, and high throughput, which presents new opportunities for designing high-speed distributed lock protocols. However, designing such protocols atop RDMA networks faces significant challenges. This paper focuses on the scalability and fairness challenges while preserving high performance, and proposes FeLock, a high-performance RDMA-based distributed lock protocol. FeLock leverages multiple types of RDMA communication verbs so that clients can communicate not only with the server to acquire and release locks but also directly with other clients to hand over lock ownership, achieving high performance, fairness, and scalability at the same time. Specifically, first, to ensure high performance, FeLock introduces a per-node lock management mechanism that reduces the number of network round trips on the critical path of the lock protocol. Second, to achieve scalability, FeLock introduces a round-robin handover mechanism in which all nodes are logically organized into a ring and clients hand over lock ownership sequentially according to their positions in the ring. Third, to ensure fairness and prevent client starvation, FeLock introduces a node credit mechanism that limits the number of consecutive lock acquisitions by any single node, so that clients on other nodes are not blocked indefinitely. Experimental results show that FeLock achieves performance comparable to or higher than that of existing one-sided RDMA lock protocols such as DSLR, while exhibiting better fairness and scalability. With 3 to 120 clients, FeLock achieves 1.01 to 7.51 times the throughput of DSLR and improves fairness by up to 2.24 times.
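The interplay between the round-robin handover and the node credit mechanism can be illustrated with a toy, single-process Python model. This is only a sketch under stated assumptions and not FeLock's actual RDMA implementation: the function name `ring_handover`, the per-node waiting queues, and the rule that a node with no waiters simply passes ownership along are all hypothetical choices made for illustration.

```python
from collections import deque


def ring_handover(queues, credit, start=0):
    """Toy model: return the order in which waiting clients are granted the lock.

    queues: list of lists; queues[i] holds the clients waiting on node i
            (hypothetical layout, not taken from the paper).
    credit: maximum number of consecutive grants a node may take before
            it must hand ownership to the next node in the ring.
    """
    if credit < 1:
        raise ValueError("credit must be at least 1")
    pending = [deque(q) for q in queues]
    order = []
    node = start
    while any(pending):
        # The current node serves its own waiters, at most `credit` in a row.
        for _ in range(credit):
            if not pending[node]:
                break
            order.append(pending[node].popleft())
        # Ownership then moves to the next node in the ring (assumed rule:
        # a node with no waiters just passes the lock along).
        node = (node + 1) % len(pending)
    return order


grants = ring_handover([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]], credit=2)
print(grants)
```

In this model the credit bounds how long a busy node can hold the lock, so clients on the second and third nodes are served before the first node's third request, which is the starvation-avoidance effect the abstract attributes to node credits.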
