    Citation: Wang Dong and Chen Shuming. DSCF: Data Streams Clustered Forwarding for Multi-Core DSPs with Memories Shared[J]. Journal of Computer Research and Development, 2008, 45(8): 1446-1553.

    DSCF: Data Streams Clustered Forwarding for Multi-Core DSPs with Memories Shared

    Abstract: Multi-core digital signal processors (MC-DSPs) are a new class of multi-core processors aimed at the emerging next generation of high-performance embedded applications. Like general-purpose multi-core processors, shared-memory MC-DSPs often suffer from the long access latency of cache coherence operations. Data speculation, which mainly consists of data fetching and data forwarding, is an effective way to hide this latency. Based on two important features of MC-DSP applications, this paper proposes data stream clustered forwarding (DSCF), a new technique for MC-DSPs with scalable shared-memory structures. In DSCF, forwarding primitives inserted into the code of producer DSP cores trigger a customized forwarding management unit (FMU), which forwards shared data streams to the local data buffers of consumer DSP cores. The transmission of a shared data stream is controlled so that it matches the rate at which the stream is consumed, and each forwarded stream is partitioned into multiple clusters for transmission. DSCF is compatible with basic shared-memory cache coherence protocols, has low hardware overhead, does not pollute the destination DSP caches, matches transmission speed to consumption speed, and improves structural scalability. Simulation with several typical DSP benchmarks shows that DSCF reduces the cache coherence miss ratio by 44% on average and improves overall system performance by 30% to 70%.
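    The forwarding flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's design: the class name `ForwardingManagementUnit`, the methods `forward_stream` and `consume`, and the parameters `cluster_size` and `buffer_capacity` are all hypothetical. The abstract does not specify the actual primitives, cluster sizes, or flow-control policy. The model shows a producer handing a stream to an FMU, the FMU splitting it into clusters, and each cluster being delivered to a consumer-side local buffer only when there is room, so that transmission is paced by consumption.

```python
from collections import deque


class ForwardingManagementUnit:
    """Toy model of an FMU (hypothetical interface, not the paper's hardware)."""

    def __init__(self, cluster_size, buffer_capacity):
        self.cluster_size = cluster_size        # items per forwarded cluster (assumed)
        self.buffer_capacity = buffer_capacity  # consumer local buffer size (assumed)
        self.pending = deque()                  # clusters not yet transmitted
        self.local_buffer = deque()             # consumer-side buffer, separate from its cache

    def forward_stream(self, stream):
        """Stand-in for a forwarding primitive issued by the producer core."""
        # Partition the shared data stream into fixed-size clusters.
        for start in range(0, len(stream), self.cluster_size):
            self.pending.append(stream[start:start + self.cluster_size])
        self._refill()

    def _refill(self):
        # Transmit the next cluster only if the whole cluster fits in the
        # local buffer, so forwarding never runs ahead of consumption.
        while self.pending and (
            len(self.local_buffer) + len(self.pending[0]) <= self.buffer_capacity
        ):
            self.local_buffer.extend(self.pending.popleft())

    def consume(self):
        """Consumer reads one item; None models falling back to a normal coherent access."""
        if not self.local_buffer:
            return None
        item = self.local_buffer.popleft()
        self._refill()  # space was freed, so another cluster may now be sent
        return item


fmu = ForwardingManagementUnit(cluster_size=4, buffer_capacity=8)
fmu.forward_stream(list(range(10)))
received = [fmu.consume() for _ in range(10)]
print(received)
```

    In this model the local buffer never holds more than `buffer_capacity` items, and forwarded data bypasses the consumer's cache entirely, which loosely mirrors the "no cache pollution" and "matched transmission speed" properties claimed in the abstract.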

       
