Abstract:
Debugging multi-core parallel programs is a well-known and difficult problem. The key difficulty is that parallel execution introduces many non-deterministic factors. Replay debugging is a promising method for eliminating non-determinism. However, state-of-the-art replay debugging solutions are not suitable for commercial software and hardware architectures, and as the degree of concurrency grows, current replay debugging methods may also incur unacceptable overhead. We propose a practical and novel replay debugging scheme named SDT (Snapshot Debug Tool). The key innovation of SDT is that it uses offline breakpoints and abstracts the replay execution, instead of performing a typical physical replay execution. SDT can be applied on commercial operating systems and hardware, and it also provides a gradually refined debugging method. According to the experimental results, SDT introduces 5188% extra execution time on average when using 8 threads. When the thread count increases from 1x to 4x, the overhead of SDT debugging increases only from 1x to 2x, which shows that SDT scales well. Recording a large amount of data remains a great challenge for SDT. The incremental snapshot capture used in our experiments proved effective at reducing both the time and the amount of data that must be recorded, thereby improving SDT performance.