    Zhang Jianmin, Li Tiejun, Li Sikun. An Address Cache of Interconnect Network in Parallel Computers[J]. Journal of Computer Research and Development, 2016, 53(2): 390-398. DOI: 10.7544/issn1000-1239.2016.20148039

    An Address Cache of Interconnect Network in Parallel Computers

    Abstract: On large-scale parallel computers, most users write their parallel programs using virtual addresses, so the efficiency of virtual-to-physical address translation directly affects program performance. A cache can markedly improve translation efficiency and reduce translation latency. This paper proposes an address translation cache (ATC) for the interconnect network of large-scale parallel computers. To raise the hit ratio, the ATC uses embedded dynamic random access memory (eDRAM), which holds more address translation table entries. An eDRAM refresh mechanism is also designed that hides refresh operations and avoids the performance loss they would otherwise cause. The ATC implements several reliability techniques, including error-correcting codes and a bypass mechanism. The widely used NPB benchmarks were run on 32 compute nodes equipped with the ATC; the average hit ratio across the 32 nodes was 95.3%, indicating that the design is correct and performs well. A comparison with three conventional caches implemented in SRAM (static random access memory) of different capacities shows that cache capacity is the key factor in improving the hit ratio.

       
