Abstract:
The speed gap between processor and memory is constantly widening, which makes program performance increasingly dependent on the design of the on-chip memory hierarchy in chip multiprocessors (CMPs). However, traditional data management mechanisms do not exploit the non-uniform cache access latency of large distributed caches in CMPs, so the conflict between miss rate and hit latency becomes increasingly serious. Furthermore, it is difficult to solve the problems of conflicting accesses and long-latency hits to shared blocks by simply relying on dynamic migration and blind replication. To address these challenges, this paper proposes an adaptive migration-replication (AMR) mechanism based on virtual shared region (VSR) partitioning in tiled CMPs. Both the state of the victim candidate in the local VSR and the activity degree of the remote source line are considered jointly, so that shared blocks accessed by different processor cores can be adaptively migrated and replicated between VSRs, which reduces the average memory access time. Finally, the VSR partition and the AMR mechanism are implemented in a full-system simulator, and the SPLASH-2 benchmark suite is used to evaluate the performance improvement. Simulation results demonstrate that AMR performs well under different degrees of sharing compared with traditional fixed-partition mechanisms, while the additional hardware overhead is negligible.
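As a rough illustration of the tension between miss rate and hit latency noted above, the following minimal sketch applies the standard average memory access time formula (hit latency plus miss rate times miss penalty). The function name `amat` and all latency and miss-rate values are hypothetical, chosen only for illustration; they are not taken from the paper's simulations.

```python
def amat(hit_latency, miss_rate, miss_penalty):
    """Average memory access time in cycles: hit latency plus expected miss cost."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be in [0, 1]")
    return hit_latency + miss_rate * miss_penalty


# Hypothetical numbers for illustration only (not results from the paper).
# Large shared cache: many hits land in distant banks, but few accesses miss.
shared = amat(hit_latency=30, miss_rate=0.02, miss_penalty=300)
# Small local slice: hits are close to the core, but more accesses miss.
local = amat(hit_latency=10, miss_rate=0.10, miss_penalty=300)

print(f"shared: {shared:.1f} cycles, local: {local:.1f} cycles")
```

Under these assumed values neither organization wins on both terms: one trades higher hit latency for fewer misses, the other the reverse. Placing shared blocks adaptively aims to lower both terms at once.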