Addressing Transient and Intermittent Link Faults in NoC with Fault-Tolerant Method
-
Abstract
Because links are the critical paths between routers in a network-on-chip (NoC), link faults seriously degrade network performance. We therefore propose a highly reliable fault-tolerant method that addresses transient and intermittent link faults. The method detects data errors in the network in real time, determines whether the underlying fault is transient or intermittent, and applies the matching recovery mechanism. As a result, it alleviates network congestion, reduces data delay, and ensures correct data transmission, effectively guaranteeing high system reliability. When a transient fault occurs in a link, the faulty link causes a data error that cannot be corrected properly. The proposed method therefore sets up a retransmission buffer, from which the backup data are retransmitted. When an intermittent fault occurs, packet transmission is truncated. To solve this problem, the proposed method adds a pseudo head flit and a pseudo tail flit to the truncated data; re-routing then begins and the occupied resources are released. Experimental results show that, under different fault conditions, the method outperforms the compared schemes, with a significant reduction in average packet latency and a clear improvement in throughput. In short, the scheme effectively improves network reliability while preserving network performance.
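The two recovery paths can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this work, and several details are assumptions made only for illustration: the `Flit` fields, the function names `classify_fault` and `split_truncated_packet`, the retry threshold `max_retries`, and the rule that a fault is treated as transient when a retransmission succeeds within that threshold. The sketch also assumes that the pseudo tail flit closes the fragment that has already crossed the faulty link, so that downstream resources can be released, and that the pseudo head flit opens the remaining fragment so that it can be re-routed.

```python
from dataclasses import dataclass
from typing import List, Tuple

HEAD, BODY, TAIL = "head", "body", "tail"


@dataclass(frozen=True)
class Flit:
    kind: str             # HEAD, BODY, or TAIL
    packet_id: int
    dest: int             # destination router id (hypothetical field)
    payload: int = 0
    pseudo: bool = False  # True for flits synthesized during recovery


def classify_fault(retry_ok: List[bool], max_retries: int = 2) -> str:
    """Classify a link fault from the outcome of retransmission attempts.

    Assumption: if any of the first `max_retries` retransmissions from the
    retransmission buffer arrives error-free, the fault is transient;
    otherwise it is treated as intermittent.
    """
    return "transient" if any(retry_ok[:max_retries]) else "intermittent"


def split_truncated_packet(
    packet: List[Flit], sent: int
) -> Tuple[List[Flit], List[Flit]]:
    """Repair a packet cut after `sent` flits crossed the faulty link.

    Returns (downstream, upstream):
      downstream: flits already sent, closed with a pseudo tail flit
      upstream:   flits still waiting, opened with a pseudo head flit
    """
    if not 0 < sent < len(packet):
        raise ValueError("packet is not truncated")
    head = packet[0]
    pseudo_tail = Flit(TAIL, head.packet_id, head.dest, pseudo=True)
    pseudo_head = Flit(HEAD, head.packet_id, head.dest, pseudo=True)
    downstream = packet[:sent] + [pseudo_tail]
    upstream = [pseudo_head] + packet[sent:]
    return downstream, upstream


# Example: a four-flit packet truncated after two flits crossed the link.
packet = [
    Flit(HEAD, packet_id=7, dest=12),
    Flit(BODY, packet_id=7, dest=12, payload=1),
    Flit(BODY, packet_id=7, dest=12, payload=2),
    Flit(TAIL, packet_id=7, dest=12, payload=3),
]
downstream, upstream = split_truncated_packet(packet, sent=2)
```

In this sketch both fragments are well-formed packets: each starts with a head flit carrying the destination and ends with a tail flit, so a router could in principle route and release them independently.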