    Long-distance function call instruction prefetching optimization for server applications

    • Abstract: Long instruction fetch latency caused by L1 instruction cache misses is one of the major bottlenecks limiting further performance gains in modern processors, especially for server applications with large instruction footprints. Instruction prefetching is a key technique for addressing this problem: it hides the high access latency by bringing the instruction blocks that will soon be needed into the upper-level cache ahead of time. In recent years, researchers have proposed many instruction prefetching architectures to alleviate the problem, but because of poor instruction locality, long-distance function calls still cause a large number of instruction misses. This paper designs a new instruction prefetching mechanism that prefetches function call target instructions with high coverage and high accuracy at low hardware overhead. Experiments show that, with the proposed optimization, the miss rate of function call target instructions is about 45% lower than that of the current state-of-the-art instruction prefetcher 1, and IPC (Instructions Per Cycle) is about 11.9% higher than the baseline and about 2.9% higher than the current state-of-the-art instruction prefetcher of similar overhead.

       
