
Addressing Transient and Intermittent Link Faults in NoC with Fault-Tolerant Method

Ouyang Yiming, Sun Chenglong, Li Jianhua, Liang Huaguo, Huang Zhengfeng, Du Gaoming

Citation: Ouyang Yiming, Sun Chenglong, Li Jianhua, Liang Huaguo, Huang Zhengfeng, Du Gaoming. Addressing Transient and Intermittent Link Faults in NoC with Fault-Tolerant Method[J]. Journal of Computer Research and Development, 2017, 54(5): 1109-1120. DOI: 10.7544/issn1000-1239.2017.20151017


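Funding: National Natural Science Foundation of China (61474036, 61274036, 61371025, 61574052); National Natural Science Foundation of China, Young Scientists Fund (61402145); Anhui Provincial Natural Science Foundation, Youth Fund (1508085QF138); Anhui Provincial Natural Science Foundation (1508085MF117)
Details
  • Chinese Library Classification (CLC) number: TP302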

Addressing Transient and Intermittent Link Faults in NoC with Fault-Tolerant Method

  • Abstract: In a network-on-chip (NoC), links are the critical paths connecting routers, and link faults severely degrade network performance. To address this, we propose a highly reliable link fault-tolerant method for transient and intermittent faults. The method detects data errors in the network in real time, uses the detection results to classify a fault as transient or intermittent, and applies the corresponding recovery. It alleviates network congestion and reduces latency while ensuring correct data transmission, which effectively safeguards system reliability. When a transient link fault causes a data error that cannot be corrected, the data backed up in a retransmission buffer is retransmitted. When an intermittent link fault causes a data error that cannot be corrected, packet transmission is truncated, and a pseudo head flit or pseudo tail flit is added to the truncated data so that it can be rerouted or the resources it occupies can be released. Experimental results show that, under different fault conditions, the method substantially reduces average packet latency and improves throughput compared with the baseline schemes, improving network reliability while preserving system performance.
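
    The two recovery paths described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Flit`, `send_packet`, `truncate`, and `link_ok` are hypothetical. The rule that a fixed number of consecutive uncorrectable errors reclassifies a fault as intermittent (`INTERMITTENT_THRESHOLD`) is an assumption, since the abstract does not state the classification criterion. The sketch also assumes that the pseudo tail flit is generated on the downstream side of the faulty link.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: this many consecutive uncorrectable errors on one flit
# reclassify the fault from transient to intermittent. The abstract
# does not give the actual criterion.
INTERMITTENT_THRESHOLD = 2


@dataclass(frozen=True)
class Flit:
    kind: str            # "head", "body", or "tail"
    packet_id: int
    payload: int
    pseudo: bool = False  # True for flits added during truncation


def truncate(flits: List[Flit], i: int,
             delivered: List[Flit]) -> Tuple[List[Flit], List[Flit]]:
    """Split a packet at flit index i after an intermittent fault."""
    pid = flits[i].packet_id
    remaining = list(flits[i:])
    # Downstream part: a pseudo tail closes the partial packet so the
    # resources it holds can be released (assumed to be generated
    # downstream of the faulty link).
    if delivered and delivered[-1].kind != "tail":
        delivered = delivered + [Flit("tail", pid, 0, pseudo=True)]
    # Upstream part: a pseudo head lets the remaining flits be rerouted.
    if remaining[0].kind != "head":
        remaining = [Flit("head", pid, 0, pseudo=True)] + remaining
    return delivered, remaining


def send_packet(flits: List[Flit],
                link_ok: Callable[[Flit, int], bool]
                ) -> Tuple[List[Flit], List[Flit]]:
    """Send one packet over one link.

    link_ok(flit, attempt) stands in for error detection/correction:
    True means the flit arrived intact or was corrected.
    Returns (flits seen downstream, flits that must be rerouted).
    """
    delivered: List[Flit] = []
    for i, flit in enumerate(flits):
        retx_buffer = flit  # backup kept until the transfer succeeds
        attempt = 0
        while not link_ok(retx_buffer, attempt):
            attempt += 1    # transient fault: retransmit the backup
            if attempt >= INTERMITTENT_THRESHOLD:
                return truncate(flits, i, delivered)
        delivered.append(retx_buffer)
    return delivered, []
```

    With a four-flit packet, a single failed transfer is absorbed by retransmission and the packet arrives unchanged, while a flit that keeps failing causes the packet to be split into a downstream part ending in a pseudo tail flit and a rerouted part starting with a pseudo head flit.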
Publication history
  • Published: 2017-04-30
