Adaptive Stack Cache with Fast Address Generation

郇丹丹; 李祖松; 王剑; 章隆兵; 胡伟武; 刘志勇

Abstract

As the gap between memory access speed and processor speed continues to widen, memory access performance has become the main bottleneck to improving processor performance. Based on an analysis of the memory access behavior of programs, this paper proposes an adaptive stack cache with fast address generation. The scheme separates stack references from data cache accesses and exploits the characteristics of stack accesses to improve instruction-level parallelism, reduce data cache pollution, and lower the data cache miss rate. A fast address generation policy reduces the hit time of stack accesses. The scheme also avoids unnecessary memory traffic. When a stack overflow occurs, the stack cache is adaptively disabled to avoid the performance cost of stack switching. A process identifier is added to the stack cache tag, so data need not be written back to lower levels of the memory hierarchy on a process switch, which makes the scheme suitable for multiprocess environments. Results on the SPEC CPU2000 benchmarks show that, with the adaptive stack cache with fast address generation, 25.8% of memory access instructions can be executed in parallel, the data cache miss rate drops by 9.4% on average, and IPC improves by 6.9% on average.
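The routing idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the design evaluated in the paper: the abstract does not give the cache organization, the overflow criterion, or the fast address generation logic. The direct-mapped layout, the 32-byte line size, the `sp`/`fp` base-register test, the `live_stack_bytes` overflow check, and the names `StackCache` and `route` are all hypothetical. Fast address generation is not modeled.

```python
from dataclasses import dataclass, field

LINE_BYTES = 32                 # assumed line size, not taken from the paper
STACK_BASE_REGS = {"sp", "fp"}  # assumed way of recognizing stack references


@dataclass
class StackCache:
    """Toy direct-mapped stack cache whose tags carry a process identifier."""
    num_lines: int = 64
    enabled: bool = True
    lines: dict = field(default_factory=dict)  # index -> (pid, tag)
    hits: int = 0
    misses: int = 0

    def access(self, pid: int, addr: int) -> bool:
        """Look up addr for process pid; return True on a hit."""
        line_addr = addr // LINE_BYTES
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        # The pid is part of the tag match, so a process switch needs no
        # flush: lines of another process simply fail to match.
        if self.lines.get(index) == (pid, tag):
            self.hits += 1
            return True
        self.lines[index] = (pid, tag)
        self.misses += 1
        return False


def route(cache: StackCache, pid: int, base_reg: str, addr: int,
          live_stack_bytes: int) -> str:
    """Return "stack" if the reference is served by the stack cache,
    otherwise "data" (the ordinary data cache, not modeled here)."""
    capacity = cache.num_lines * LINE_BYTES
    # Hypothetical overflow rule: the live stack no longer fits in the
    # stack cache. In this sketch the cache then stays disabled.
    if live_stack_bytes > capacity:
        cache.enabled = False
    if not cache.enabled or base_reg not in STACK_BASE_REGS:
        return "data"
    cache.access(pid, addr)
    return "stack"
```

Calling `route` with a base register of `sp` and a small `live_stack_bytes` sends the reference to the stack cache, while any other base register, or any call after an overflow, falls through to the data cache. Because stack and non-stack references use separate structures, a real design could issue one of each in the same cycle, which is the source of the parallelism the abstract reports.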
