ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2019, Vol. 56 ›› Issue (4): 692-707.doi: 10.7544/issn1000-1239.2019.20170973

• 综述 • 上一篇    下一篇

类Paxos共识算法研究进展

王江,章明星,武永卫,陈康,郑纬民   

  1. (清华大学计算机科学与技术系 北京 100084) (北京信息科学与技术国家研究中心 北京 100084) (清华大学深圳研究生院 广东深圳 518055) (jiang-wa15@mails.tsinghua.edu.cn)
  • 出版日期: 2019-04-01
  • 基金资助: 
    国家自然科学基金项目(61520106005,61761136014);国家重点研发计划项目(2017YFB1010000)

Paxos-like Consensus Algorithms: A Review

Wang Jiang, Zhang Mingxing, Wu Yongwei, Chen Kang, Zheng Weimin   

  1. (Department of Computer Science and Technology, Tsinghua University, Beijing 100084) (Beijing National Research Center for Information Science and Technology, Beijing 100084) (Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong 518055)
  • Online: 2019-04-01

摘要: 随着互联网数据量和业务量的快速增长,集群规模越来越大,由于机器和网络等故障使得业务中断的可能性越来越高.如何实现一个容错的分布式系统十分重要.多机达成共识是分布式容错系统中一个最基础,最核心的问题.Paxos等一系列共识算法的出现有效地解决了这个问题.近年来,越来越多的系统使用共识相关技术,关于分布式共识算法的研究也层出不穷.这些共识算法可以被划分成2个大类:强领导者共识算法和弱领导者共识算法.随着远程直接内存访问(remote direct memory access, RDMA)等网络技术和现场可编程门阵列(field-programmable gate array, FPGA)等硬件技术的发展,又出现了一些结合新型网络和硬件技术的共识算法研究,用来提升分布式系统的性能.将从分布式共识算法发展历程的角度,介绍Paxos系列算法,阐述算法演进过程中的关键研究,讨论相关算法在不同场景下的优劣势,并展望该类算法的未来发展方向与前景.

关键词: 分布式共识, 容错, Paxos算法, 强领导者共识, 弱领导者共识

Abstract: With the rapid growth of data volume and Web services, the cluster size is getting bigger and bigger in datacenters. The probability of service interruption grows dramatically due to machine and network failures. How to achieve a fault-tolerant distributed system becomes very important. State machine replication is one of the most general methods for building a fault-tolerant system, and distributed consensus problem is one of the most basic and core issues in replicated state machine systems. Paxos and a series of Paxos-like consensus algorithms can effectively solve this problem. In recent years, more and more systems use consensus-related techniques to ensure their reliability and availability, and studies on distributed consensus algorithms are also emerging in an endless stream. These consensus algorithms can be divided into two categories, leader-based consensus algorithms and leaderless consensus algorithms. With the development of network technologies such as remote direct memory access(RDMA) and hardware technologies such as field-programmable gate array(FPGA), some consensus algorithms combining with new network technologies and hardware technologies have appeared, which are used to improve the performance of distributed systems. In this paper, we introduce Paxos series algorithms from the perspective of the development of distributed consensus algorithms, discuss the advantages and disadvantages of the algorithms in different scenarios, and further give a future outlook on the research and application directions.

Key words: distributed consensus, fault tolerance, Paxos, leader-based consensus, leaderless consensus

中图分类号: