Abstract:
As technology advances and silicon feature sizes shrink, computer systems face an inevitably increasing susceptibility to transient faults. Accordingly, processor dependability and trustworthiness have become major concerns for application systems. Recently, much work has been done at different levels to achieve fault tolerance against transient faults in processor systems. In this paper, a novel and comprehensive taxonomy of recent processor fault-tolerance research is put forward. Based on this taxonomy, techniques for incorporating fault tolerance, especially transient-fault tolerance, into modern processor systems at different levels are reviewed. Some important processor fault-tolerance architectures and representative studies are also briefly introduced and analyzed. Finally, some advice and possible trends in processor fault-tolerance research are proposed, in the hope that they will benefit researchers in the field.