Abstract:
Byzantine fault tolerance algorithm is one kind of fault-tolerant algorithms which can tolerate various software errors and system vulnerabilities. It is of vital importance to the reliability of cloud computing. Compared with other fault-tolerant algorithms, such as proof-of-work (PoW), Byzantine fault tolerance algorithm is much more stable, however, its poor performance cannot meet the demand of cloud computing which requires high throughput and low latency. In-network computing is a data-centric architecture that uses the network to perform some calculations. Using in-network computing, data can be processed as it moves, thereby improving system performance. To solve the performance problem of Byzantine fault tolerant system, in this paper, we propose a Byzantine fault tolerance algorithm optimization strategy with in-network computing, which offloads some of the computational tasks to the network interface card (NIC). The processor and NIC form a multi-stage pipeline which helps us improve the system throughput. Simply using in-network computing can not meet the performance goals in all scenarios, hence we utilize multi-threading technology to scale the system. We evaluate our method on real testbed, and the experimental results show that, compared with the default Byzantine fault tolerant system, we can obtain 46% improvement in overall throughput and 65% decrease in latency. The results have proved our solution to be available and effective.