Abstract:
With the rise of cloud computing, a growing number of irregular data-intensive applications run on chip multi-core processors (CMPs), and their performance suffers badly from data cache misses. Traditional methods run a helper thread on an idle core that tries to push irregular data into the shared last-level cache (LLC) shortly before a computing core uses it. If the helper thread runs faster than the main thread, it can push hot data into the LLC before the main thread needs it, and the performance of the hot slice may improve. For a hot slice with a low computing workload, however, the traditional method cannot build a helper thread that runs faster than the main thread. This paper targets performance optimization of irregular data-intensive hot slices with low computing workloads. First, a formal description of the prefetch QoS of the helper thread is given; then a new performance optimization method is proposed. The new method is implemented on real commercial processors and requires no additional hardware modifications. Measurements on an Intel Core 2 Duo Processor 6550 show that, compared with the traditional method, the performance of the scientific computing benchmarks em3d and mst and of SPEC CPU2006 mcf improves by 1.97%, 31.63%, and 1.10%, respectively.
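For illustration only, the traditional helper-thread scheme described in the abstract might be sketched as follows. This is a minimal sketch and not the method proposed in the paper: the linked list stands in for irregular data, and the names `node_t`, `helper_prefetch`, and `sum_with_helper` are hypothetical. It assumes GCC or Clang (for `__builtin_prefetch`) and POSIX threads, and it does not pin threads to specific cores.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical linked-list node standing in for irregular data. */
typedef struct node {
    struct node *next;
    long value;
} node_t;

/* Helper thread: walks the list without doing the main thread's
 * computation. Reading n->next touches each node, and the prefetch
 * hint requests the following node early (rw = 0 for read,
 * locality = 1 for low temporal locality). */
static void *helper_prefetch(void *arg)
{
    for (const node_t *n = arg; n != NULL; n = n->next) {
        __builtin_prefetch(n->next, 0, 1);
    }
    return NULL;
}

/* Main thread's hot slice: very little computation per node, so the
 * traversal is dominated by memory access time. */
long sum_with_helper(node_t *head)
{
    pthread_t helper;
    /* If the helper cannot be started, fall back to running alone. */
    int started = (pthread_create(&helper, NULL, helper_prefetch, head) == 0);

    long sum = 0;
    for (const node_t *n = head; n != NULL; n = n->next) {
        sum += n->value;
    }

    if (started) {
        pthread_join(helper, NULL);
    }
    return sum;
}
```

In this sketch the main loop does almost no work per node, so the helper has to chase the same pointers at roughly the same speed and has little chance of staying ahead. That is the low-computing-workload case the abstract identifies as the weakness of the traditional approach.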