    Citation: Lü Huiwei, Cheng Yuan, Bai Lu, Chen Mingyu, Fan Dongrui, Sun Ninghui. Parallel Simulation of Many-Core Processor and Many-Core Clusters[J]. Journal of Computer Research and Development, 2013, 50(5): 1110-1117.

    Parallel Simulation of Many-Core Processor and Many-Core Clusters

    • Abstract: Simulators are an important tool for computer architecture research. The recent development of parallel architectures poses a great challenge to computer simulation. On the target side, as processors move towards multi-core and many-core designs, the scale of the simulated system keeps growing because the number of simulated cores increases at the pace of Moore's law. On the host side, the speed of sequential simulation has stagnated because the clock frequency of the host processor running the simulator is no longer rising quickly. For these two reasons, traditional sequential simulation can no longer meet the scale and speed requirements of simulating emerging architectures. Using two examples, a many-core processor and a many-core cluster, this paper shows that parallel simulation is both necessary and feasible for simulating parallel computer architectures. For the many-core processor, parallel discrete event simulation (PDES) accelerates the simulator, raising simulation speed by a factor of 10.9 with no loss of simulation accuracy. For the many-core cluster, the simulated target system reaches a total scale of 1024 cores and supports a runtime environment for hybrid MPI/Pthreads programming.

       
