    Citation: Gao Xiang, Zhang Longbing, Hu Weiwu. A Capacity-Shared Heterogeneous CMP Cache[J]. Journal of Computer Research and Development, 2008, 45(5): 877-885.

    A Capacity-Shared Heterogeneous CMP Cache

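    • Abstract (translated from Chinese): Cache design for multicore chips is constrained by wire delay, application behavior, and other factors, and both private and shared schemes have their own shortcomings. This paper proposes a heterogeneous CMP cache structure that builds a multicore chip from two classes of nodes with different cache hierarchies, and designs techniques such as indirect-index-based cache capacity sharing to provide an on-chip memory hierarchy that is capacity-efficient and fast to access. Full-system evaluation of programs including SPEC CPU2000 and SPLASH2 shows that the heterogeneous CMP cache structure adapts to the needs of different kinds of applications, improving average performance by up to 16% for single-process applications and up to 9% for multithreaded applications. The heterogeneous CMP cache is also simple to design in hardware and practical to implement, and its design ideas will be applied in future Godson (龙芯) multicore processor designs.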

       

      Abstract: The characteristics of advanced integrated circuit technologies require architects to look for new ways to utilize large numbers of gates and to mitigate the effects of high interconnect delays. Chip multiprocessors (CMPs) exploit increasing transistor counts by placing multiple processors on a single die. As CMPs have become the trend in high-performance microprocessors, their target workloads have become increasingly diverse. Because of the wire delay problem and the diversity of applications, neither private nor shared caches can provide both large capacity and fast access in CMPs. A novel CMP cache design, the heterogeneous CMP cache (HCC), is presented, in which chips are constructed from tiles of two different categories. The L2 caches of private tiles provide the lowest hit latency, and the L2 cache of shared tiles increases the effective cache capacity for shared data. By incorporating indirect-index cache technology to share capacity between different hierarchies, HCC provides an on-chip memory subsystem that is both capacity-effective and fast to access. Detailed full-system simulations are used to analyze HCC performance for various programs, including SPEC CPU2000, SPLASH2, and commercial workloads. The results show that HCC improves performance by 16% for single-threaded benchmarks and 9% for multithreaded benchmarks. HCC is easy to implement, and its design ideas will be used in future multicore processors of the Godson series.

       
