ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (2): 390-398. doi: 10.7544/issn1000-1239.2016.20148039

• System Architecture •

An Address Cache of Interconnect Network in Parallel Computers

Zhang Jianmin, Li Tiejun, Li Sikun

  1. (College of Computer, National University of Defense Technology, Changsha 410073) (jmzhang@nudt.edu.cn)
  • Published: 2016-02-01
  • Online: 2016-02-01
  • Supported by: 
    the National Natural Science Foundation of China (61103083, 61133007); the National High Technology Research and Development Program of China (863 Program) (2012AA01A301); the National Basic Research Program of China (973 Program) (2011CB309705)

Abstract: Most users of large-scale parallel computers write their parallel programs using virtual addresses, so the efficiency of virtual-to-physical address translation directly affects the performance of those programs. A cache can substantially improve translation efficiency and reduce translation latency. This paper proposes an address translation cache (ATC) for the interconnect network of large-scale parallel computers. To raise the hit ratio, the ATC uses embedded dynamic random access memory (eDRAM), which holds more address translation table entries. A new eDRAM refresh mechanism hides refresh operations and avoids the performance loss that refresh would otherwise cause. The ATC also implements several reliability techniques, including error-correcting code and a novel bypass module. The widely used NPB benchmarks were run on 32 compute nodes equipped with the ATC; the average ATC hit ratio across the 32 nodes reached 95.3%, indicating that the design is correct and performs well. A comparison with three conventional caches implemented in SRAM (static random access memory) of different capacities shows that cache capacity is the key factor in improving the hit ratio.

Key words: parallel computer, interconnect network, virtual address, physical address, address translation cache
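
The abstract's closing claim, that capacity is the key factor in hit ratio, can be illustrated with a toy model. The sketch below is not the authors' ATC design: the class `TranslationCache`, the helper `run_trace`, the LRU replacement policy, the 4 KiB page size, and the synthetic cyclic access trace are all assumptions made for illustration only.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages; the abstract does not state a page size


class TranslationCache:
    """Toy LRU cache of virtual-page -> physical-page mappings (hypothetical)."""

    def __init__(self, capacity, page_table):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.page_table = page_table  # backing table: vpn -> ppn
        self.entries = OrderedDict()  # cached mappings, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address for vaddr, updating hit/miss counters."""
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # mark as most recently used
            ppn = self.entries[vpn]
        else:
            self.misses += 1
            ppn = self.page_table[vpn]  # slow path: consult the full table
            self.entries[vpn] = ppn
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return (ppn << PAGE_SHIFT) | offset

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def run_trace(capacity, pages=4, rounds=10):
    """Cycle through `pages` virtual pages `rounds` times; return the hit ratio."""
    page_table = {vpn: vpn + 100 for vpn in range(pages)}  # made-up mapping
    cache = TranslationCache(capacity, page_table)
    for _ in range(rounds):
        for vpn in range(pages):
            cache.translate(vpn << PAGE_SHIFT)
    return cache.hit_ratio()
```

With this synthetic trace, `run_trace(2)` returns 0.0 because a two-entry LRU cache thrashes on a four-page cycle, while `run_trace(4)` returns 0.9 because only the first pass misses. Real workloads such as NPB have far less regular access patterns, so the numbers here say nothing about the 95.3% figure reported in the paper.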

CLC number: