ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (9): 1943-1952. doi: 10.7544/issn1000/1239.2016.20148367

An Approach for Identifying SDC-Causing Instructions by Fault Propagation Analysis

Ma Junchi1,2, Wang Yun1,2, Cai Zhenbo3, Zhang Qingxiang3, Wang Ying3, Hu Cheng1,2   

  1. School of Computer Science & Engineering, Southeast University, Nanjing 211189
  2. Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189
  3. Institute of Spacecraft System Engineering, China Academy of Space Technology, Beijing 100094
  • Online: 2016-09-01

Abstract: Single event upsets (SEUs) are caused by radiation in outer space and strongly affect the computing reliability of space devices. As process technology scales down, space devices become more susceptible to SEUs. An SEU can result in silent data corruption (SDC), in which a program produces wrong output without any detected crash. SDC may lead to serious failures and hence cannot be ignored. Because SDC-causing faults propagate silently, SDC is very difficult to detect. The first step in developing SDC detectors is to identify the SDC-causing instructions of a program. However, this step usually requires a huge number of fault injections, which is extremely time-consuming and infeasible for most applications. In this paper, we build a data dependence graph (DDG) to capture the dependencies among the values of instructions. We then analyze the inter-function and intra-function propagation that leads to SDC and establish a sufficient condition for an instruction to be SDC-causing. Building on this analysis, we propose a novel method for identifying SDC-causing instructions. By exploiting the trace files of fault injections, our method identifies potential SDC-causing instructions without expensive computation. Validation shows that our method achieves high accuracy and coverage while greatly reducing injection cost.
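
To make the idea of a data dependence graph concrete, the following is a minimal Python sketch, not the method of the paper. It assumes a hypothetical toy trace format (instruction id, destination, source operands) and applies only a simplified screen: an instruction is kept if its value can reach an output instruction through data dependence edges. Control-flow and address-related propagation are ignored, and the sufficient condition established in the paper is not reproduced here. The names TRACE, build_ddg, reachable, and may_reach_output are illustrative assumptions.

```python
from collections import defaultdict, deque

# Hypothetical toy trace: (instruction id, destination, source operands).
TRACE = [
    ("i1", "a", []),
    ("i2", "b", []),
    ("i3", "c", ["a", "b"]),
    ("i4", "d", ["a"]),
    ("i5", "out", ["c"]),
]


def build_ddg(trace):
    """Map each instruction to the later instructions that read its result."""
    last_writer = {}          # value name -> instruction that last wrote it
    succ = defaultdict(set)   # producer instruction -> consumer instructions
    for inst, dest, srcs in trace:
        for src in srcs:
            if src in last_writer:
                succ[last_writer[src]].add(inst)
        last_writer[dest] = inst
    return succ


def reachable(succ, start):
    """Breadth-first search over dependence edges, including the start node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def may_reach_output(trace, outputs):
    """Instructions whose value can flow to an output via data dependences.

    Assumption: this is a simplified data-flow screen only, not the
    sufficient condition for SDC-causing instructions from the paper.
    """
    succ = build_ddg(trace)
    return sorted(inst for inst, _, _ in trace if reachable(succ, inst) & outputs)


print(may_reach_output(TRACE, {"i5"}))
```

In this toy trace, i4 writes a value that no later instruction reads, so a fault injected there would be masked under the data-flow-only assumption and the instruction is screened out.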

Key words: single event upset (SEU), soft error, SDC-causing instruction, fault injection, fault propagation
