高级检索
    唐轶轩, 吴俊敏, 陈国良, 隋秀峰, 黄 景. 面向多线程程序基于效用的Cache优化策略[J]. 计算机研究与发展, 2013, 50(1): 170-180.
    引用本文: 唐轶轩, 吴俊敏, 陈国良, 隋秀峰, 黄 景. 面向多线程程序基于效用的Cache优化策略[J]. 计算机研究与发展, 2013, 50(1): 170-180.
    Tang Yixuan, Wu Junmin, Chen Guoliang, Sui Xiufeng, Huang Jing. A Utility Based Cache Optimization Mechanism for Multi-Thread Workloads[J]. Journal of Computer Research and Development, 2013, 50(1): 170-180.
    Citation: Tang Yixuan, Wu Junmin, Chen Guoliang, Sui Xiufeng, Huang Jing. A Utility Based Cache Optimization Mechanism for Multi-Thread Workloads[J]. Journal of Computer Research and Development, 2013, 50(1): 170-180.

    面向多线程程序基于效用的Cache优化策略

    A Utility Based Cache Optimization Mechanism for Multi-Thread Workloads

    • 摘要: 为了提供高速的数据访问,多核处理器常使用Cache划分机制来分配二级Cache资源,但传统的共享Cache划分算法大多是面向多道程序的,忽略了多线程负载中共享和私有数据访问模式的差别,使得共享数据的使用效率降低.提出了一种面向多线程程序的Cache 管理机制UPP,它通过监控Cache 中共享、私有数据的效用信息,为每个线程以及共享数据分配Cache 空间,使得各个线程以及共享数据的边际效用最大化,从而提高负载的整体性能.另外,UPP还考虑了程序中数据的使用频率以及临近性信息,通过提升、动态插入策略过滤低重用数据,从而使得高频数据块留在Cache中.通过实验表明,其性能相对于基于LRU的纯共享Cache结构和基于公平的静态Cache划分结构均有提升.

       

      Abstract: Modern multi-core processors usually employ shared level 2 cache to support fast data access among concurrent threads. However, under the pressure of high resource demand, the commonly used LRU policy may result in interferences among threads and degrades the overall performance. Partitioning the shared cache is a relatively flexible resource allocation method, but most previous partition approaches aimed at multi-programmed workloads and they ignored the difference between shared and private data access patterns of multi-threaded workloads, leading to the utility decrease of the shared data. Most traditional cache partitioning methods aim at single memory access pattern, and neglect the frequency and recency information of cachelines. In this paper, we study the access characteristics of private and shared data in multi-thread workloads, and propose a utility-based pseudo partition cache partitioning mechanism (UPP). UPP dynamically collects utility information of each thread and shared data, and takes the overall marginal utility as the metric of cache partitioning. Besides, UPP exploits both frequency and recency information of a workload simultaneously, in order to evict dead cachelines early and filter less reused blocks through dynamic insertion and promotion mechanism.

       

    /

    返回文章
    返回