The Tradeoff Cache Between Latency and Capacity in Chip Multiprocessors

Xiao Junhua; Feng Zijun; Zhang Longbing

Xiao Junhua, Feng Zijun, Zhang Longbing. The Tradeoff Cache Between Latency and Capacity in Chip Multiprocessors[J]. Journal of Computer Research and Development, 2009, 46(1): 167-175.

Abstract

Chip multiprocessors (CMPs) have become the mainstream microprocessor architecture. In a CMP, the cache, especially the last-level cache, is critical to performance and has become a focus of current research. A CMP cache faces conflicting latency and capacity requirements and must trade off between techniques that reduce off-chip misses and those that reduce cross-chip misses. A private cache design minimizes access latency but reduces the total effective cache capacity. A shared cache design maximizes the effective cache capacity but incurs long hit latency. This paper proposes a CMP cache design called the tradeoff cache between latency and capacity (TCLC), a hybrid of the private and shared designs. TCLC dynamically identifies the sharing type of each cache block and optimizes each type separately: private blocks through a migration policy, shared read-only blocks through a replication policy, and shared read-write blocks through a center placement policy. TCLC aims to bring cache access latency close to that of a private design and effective cache capacity close to that of a shared design, which mitigates the impact of wire delay and reduces the average memory access latency. Experimental results indicate that the proposal performs 13.7% better than a private cache and 12% better than a shared cache.
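
The mapping from sharing type to placement policy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how TCLC detects a block's sharing type, so the bookkeeping here (`BlockHistory`, with per-block sets of reader and writer cores) and the classification rule in `classify` are assumptions made for illustration only; they are not the authors' mechanism. Only the three sharing types and their three policies come from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class SharingType(Enum):
    PRIVATE = "private"
    SHARED_READ_ONLY = "shared read-only"
    SHARED_READ_WRITE = "shared read-write"


class Policy(Enum):
    MIGRATION = "migration"
    REPLICATION = "replication"
    CENTER_PLACEMENT = "center placement"


@dataclass
class BlockHistory:
    """Hypothetical per-block record of which cores read or wrote the block."""
    readers: set = field(default_factory=set)
    writers: set = field(default_factory=set)

    def record(self, core: int, is_write: bool) -> None:
        # Remember the accessing core in the set matching the access kind.
        (self.writers if is_write else self.readers).add(core)


def classify(history: BlockHistory) -> SharingType:
    """Assumed rule: one accessing core means private; several cores with
    no writer means shared read-only; otherwise shared read-write."""
    cores = history.readers | history.writers
    if len(cores) <= 1:
        return SharingType.PRIVATE
    if not history.writers:
        return SharingType.SHARED_READ_ONLY
    return SharingType.SHARED_READ_WRITE


# Type-to-policy mapping as stated in the abstract.
POLICY_FOR = {
    SharingType.PRIVATE: Policy.MIGRATION,
    SharingType.SHARED_READ_ONLY: Policy.REPLICATION,
    SharingType.SHARED_READ_WRITE: Policy.CENTER_PLACEMENT,
}


def choose_policy(history: BlockHistory) -> Policy:
    """Pick the placement policy for a block from its access history."""
    return POLICY_FOR[classify(history)]
```

In this sketch, a block touched only by core 0 is classified as private and selects migration; once core 1 also reads it, it becomes shared read-only and selects replication; a later write by any sharer moves it to shared read-write and center placement.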
