
    Shenwei-26010: A High-Performance Many-Core Processor

    • Abstract: Building on the Shenwei 1600 multi-core processor, the Shenwei 26010 high-performance many-core processor uses system-on-chip (SoC) technology to integrate 4 computing-control cores and 256 computing cores on a single chip. It implements an independently designed 64-bit Shenwei RISC (reduced instruction set computer) instruction set and supports 256-bit SIMD (single instruction multiple data) integer and floating-point vector operations, giving a single-chip peak double-precision floating-point performance of 3.168 TFLOPS. The processor is fabricated in a 28 nm process, has a die area of more than 500 mm², and all 260 of its cores run stably at 1.5 GHz. It combines multiple low-power design techniques at the architecture, microarchitecture, and circuit levels, reaching a peak energy efficiency of 10.559 GFLOPS/W. Both its operating frequency and its energy efficiency exceed those of comparable international processors of the same period. Through technical innovations in high-frequency design, reliability design, and yield design, the Shenwei 26010 addresses the high-frequency target, power wall, reliability, and yield problems encountered in pursuing high performance. It has been deployed at large scale in the domestically built 100 PFLOPS supercomputer "Sunway TaihuLight" (神威·太湖之光), where it meets the computing needs of scientific and engineering applications.
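    The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper. It derives the chip power implied by the quoted peak performance and energy efficiency, and shows one per-core breakdown that reproduces the 3.168 TFLOPS peak. The two flops-per-cycle constants and all function names are assumptions that the abstract does not state.

```python
# Figures quoted in the abstract.
PEAK_TFLOPS = 3.168               # peak double-precision performance
EFFICIENCY_GFLOPS_PER_W = 10.559  # peak energy efficiency
CLOCK_GHZ = 1.5                   # stable operating frequency
CONTROL_CORES = 4                 # computing-control cores
COMPUTE_CORES = 256               # computing cores

# ASSUMPTIONS (not stated in the abstract):
# a 256-bit SIMD fused multiply-add on four doubles counts as 8 flops per
# cycle on a computing core; 16 flops per cycle is assumed for a
# computing-control core so that the total matches the quoted peak.
FLOPS_PER_CYCLE_COMPUTE = 8
FLOPS_PER_CYCLE_CONTROL = 16


def implied_power_watts(peak_tflops: float, gflops_per_watt: float) -> float:
    """Power implied by a peak rate and an energy-efficiency figure."""
    return peak_tflops * 1000.0 / gflops_per_watt


def peak_gflops(clock_ghz: float) -> float:
    """Peak rate in GFLOPS under the assumed per-core flops per cycle."""
    compute = COMPUTE_CORES * FLOPS_PER_CYCLE_COMPUTE * clock_ghz
    control = CONTROL_CORES * FLOPS_PER_CYCLE_CONTROL * clock_ghz
    return compute + control


if __name__ == "__main__":
    power = implied_power_watts(PEAK_TFLOPS, EFFICIENCY_GFLOPS_PER_W)
    print(f"Implied power at peak: {power:.2f} W")
    print(f"Peak under assumptions: {peak_gflops(CLOCK_GHZ) / 1000.0:.3f} TFLOPS")
```

    Dividing the quoted peak by the quoted efficiency gives roughly 300 W. The per-core breakdown matches the quoted peak only under the assumed flops-per-cycle values, so it should be read as a consistency check, not as a description of the actual pipeline design.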

       
