Abstract:
Multi-core digital signal processors (MC-DSPs) are a new class of multi-core processors aimed at the next generation of high-performance embedded applications. Like general-purpose multi-core processors, MC-DSPs with shared-memory structures often suffer from the long access latency of cache coherence operations. Data speculation, which mainly consists of data fetching and data forwarding, is an efficient way to hide this latency. Starting from two important application features of MC-DSPs, a new data stream clustered forwarding (DSCF) technique is proposed for MC-DSPs with scalable shared-memory structures. In DSCF, dedicated data stream forwarding primitives are inserted into the code of the producer DSP cores; these primitives trigger a customized forwarding management unit (FMU) that forwards shared data streams to the local data buffers of the consumer DSP cores. The transmission of each shared data stream is paced to match the rate at which it is consumed, and each forwarded stream is partitioned into multiple clusters for transmission. DSCF is compatible with basic shared-memory cache coherence protocols, and it offers low hardware overhead, no pollution of the destination DSP caches, well-matched transmission speed, and improved structural scalability. Simulations with several typical DSP benchmarks show that DSCF reduces the cache coherence miss ratio of the MC-DSP by 44% on average and improves overall system performance by 30% to 70%.