ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (6): 1266-1277. doi: 10.7544/issn1000-1239.2015.20150160

Special Issue: 2015 Architecture Oriented to Application-Domain Requirements


A Trace-Driven Simulation of Memory System in Multithread Applications

Zhu Pengfei1,3, Lu Tianyue2,3, Chen Mingyu2   

  1. State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; 2. Center for Advanced Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; 3. University of Chinese Academy of Sciences, Beijing 100049
  • Online:2015-06-01

Abstract: Chip multiprocessors (CMPs) have become increasingly important for multithreaded applications because of their high throughput in big data computing. However, growing memory latency increasingly limits system performance because of the memory wall. Two independent simulation methods, trace-driven and execution-driven, are available to system researchers for studying and evaluating the memory system. On the one hand, researchers employ trace-driven simulation for its speed: it omits data processing and is therefore faster than its execution-driven counterpart. On the other hand, the lack of data processing induces both global and local trace misplacements, which never occur when multithreaded applications run on a real machine. Analytical modeling shows that trace misplacements cause marked variations in performance metrics. The underlying reasons are that, in trace-driven simulation: 1) locks do not prevent multiple threads from entering a critical section at the same time; 2) barriers do not synchronize threads as needed; 3) dependences among memory operations are violated. To improve the accuracy of memory system simulation for multithreaded applications, a methodology is designed to eliminate both global and local trace misplacement in trace-driven simulation. Experiments show that eliminating global trace misplacement of memory operations leads to a reduction of up to 10.22% in various IPC metrics, while eliminating local trace misplacement of memory operations leads to a reduction of at least 50% in the arithmetic mean of IPC metrics. The proposed methodology ensures that a multithreaded application's behavior remains invariant in trace-driven simulation.
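
The first two causes listed in the abstract (locks and barriers losing their effect during replay) can be illustrated with a toy replayer. The following is only a minimal sketch and not the methodology of the paper: the record format (`"mem"`, `"lock"`, `"unlock"`, `"barrier"`), the function name `replay`, and the round-robin issue order are all assumptions made for illustration. It also assumes every thread takes part in every barrier and that each barrier identifier is used once.

```python
from collections import deque


def replay(traces, honor_sync=True):
    """Replay per-thread traces round-robin and return the issued memory
    operations as (thread_id, address) pairs in global issue order.

    Hypothetical record format: (kind, arg) with kind in
    {"mem", "lock", "unlock", "barrier"}.
    """
    queues = [deque(t) for t in traces]
    lock_owner = {}        # lock id -> thread id currently holding it
    barrier_arrived = {}   # barrier id -> set of thread ids that arrived
    issued = []
    n_threads = len(traces)

    while any(queues):
        progressed = False
        for tid, q in enumerate(queues):
            if not q:
                continue
            kind, arg = q[0]
            if kind == "mem":
                issued.append((tid, arg))
            elif honor_sync and kind == "lock":
                if lock_owner.get(arg, tid) != tid:
                    continue  # stall: another thread holds this lock
                lock_owner[arg] = tid
            elif honor_sync and kind == "unlock":
                lock_owner.pop(arg, None)
            elif honor_sync and kind == "barrier":
                arrived = barrier_arrived.setdefault(arg, set())
                arrived.add(tid)
                if len(arrived) < n_threads:
                    continue  # stall: not every thread has arrived yet
            # With honor_sync=False, synchronization records are simply
            # consumed, which mimics a replay that ignores them.
            q.popleft()
            progressed = True
        if not progressed:
            raise RuntimeError("no thread can make progress (deadlocked trace)")
    return issued


# Two threads execute the same critical section guarded by lock "L".
critical = [("lock", "L"), ("mem", 0x10), ("mem", 0x14), ("unlock", "L")]
naive = replay([critical, critical], honor_sync=False)
synced = replay([critical, critical], honor_sync=True)
```

In this sketch, `naive` interleaves the two threads' accesses inside the critical section, while `synced` issues all of thread 0's accesses before thread 1's. That difference in ordering is the kind of misplacement the abstract attributes to replay without synchronization.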

Key words: trace-driven simulation, accuracy, memory system, multithread applications, trace collection and replay
