Abstract:
Shared-memory multithreaded programs running on chip multiprocessors tend to be nondeterministic. Two-phase deterministic record-replay is an effective approach to resolve this problem. Memory race recording is the key technology to replay multithreaded programs deterministically. It is significant to develop an efficient memory race recording scheme with both low log growth rate and rapid replay speed. A cyclic memory race recording algorithm based on point-to-point logging approach, named CyclicMR, is proposed. CyclicMR presents each memory race by using a new current dependency, uses hardware signatures with small size to detect memory races instead of cache memory, reduces the continuous memory races with same direction by a conflict direction detecting mechanism, and records an innovative cyclic dependency which can achieve much more transitivity. In this algorithm, all memory races recorded between two threads are loop-shaped, significantly reducing the redundancy of memory races. At the same time, cyclic dependency is further optimized by an incremental instruction counter, and the size of memory race is reduced a lot. Using an 8-core chip multiprocessor system, an exact comparison with earlier mainstream approaches is performed. The analysis results show that CyclicMR achieves small log growth rate, low hardware overhead and low bandwidth overhead. And it also has good scalability in memory race log.