    Citation: Wang Huandong, Gao Xiang, Chen Yunji, Hu Weiwu. Interconnection of Godson-3 Multi-Core Processor[J]. Journal of Computer Research and Development, 2008, 45(12): 2001-2010.

    Interconnection of Godson-3 Multi-Core Processor

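    • Abstract (translated from the Chinese): The interconnect of Godson-3 adopts a scalable, distributed multi-core architecture based on a 2D mesh, which provides a unified topology and logic design for chip-level, board-level, and system-level interconnection. The external interface of Godson-3 uses an extended HyperTransport protocol, which serves both to connect IO devices and to interconnect multiple chips. The interconnect also includes a software-configurable routing mechanism, so that medium-scale CC-NUMA systems and larger-scale NCC-NUMA systems can be built directly at the board level with an efficient communication mechanism. This paper introduces the interconnect architecture of multiprocessor systems based on Godson-3. The design uses a two-level scalable interconnect: on chip, a 2D mesh connects multiple nodes, and within each node a crossbar connects multiple processor cores and L2 cache modules. Between chips, a 16-core multiprocessor system can be built through cache-coherent HyperTransport interfaces without additional hardware support. Using a hierarchical directory technique, Godson-3 can also support larger-scale multiprocessor systems. The Godson-3 interconnect architecture provides strong support for building simple, efficient, flexible, and highly scalable shared-memory multiprocessor systems.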

       

      Abstract: The interconnection of the Godson-3 multi-core processor adopts a scalable distributed architecture based on a 2D mesh, providing a unified topology for chip-level, board-level, and system-level design. The HyperTransport protocol is implemented in Godson-3 for both IO connection and multi-chip interconnection. Software-configurable routing provides efficient communication for board-level CC-NUMA systems and for larger-scale NCC-NUMA systems. This paper introduces the interconnection network of the Godson-3 multi-core processor. At the chip level, a two-level scalable architecture is implemented inside Godson-3: at the top level, a 2D mesh connects all the nodes; inside every node, two crossbars connect 4 cores and 4 L2 caches, along with memory controllers and IO controllers. At the board level, a medium-scale multi-chip CC-NUMA system can be built easily by interconnecting chips through HyperTransport interfaces that support the cache coherence protocol. A larger-scale multi-chip system can be built by using dedicated hardware for networking. With this architectural support, it becomes much easier to build an efficient and scalable shared-memory multi-chip system.
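
      The two-level organization and the software-configurable routing mentioned in the abstract can be pictured with a small model. The Python sketch below is only an illustration, not the Godson-3 hardware design: the `Mesh` class, its method names, and the 2x2 mesh size are hypothetical, and dimension-ordered (XY) routing is used as a stand-in way of filling the routing table, since the abstract does not say which routes the software actually programs. Only the figures of 4 cores and 4 L2 caches per node come from the abstract.

```python
from dataclasses import dataclass, field

CORES_PER_NODE = 4      # from the abstract: each node connects 4 cores
L2_BANKS_PER_NODE = 4   # from the abstract: and 4 L2 caches


@dataclass
class Mesh:
    """Hypothetical model of a width x height 2D mesh of nodes."""
    width: int
    height: int
    # Software-writable table: (current node, destination node) -> next hop.
    route_table: dict = field(default_factory=dict)

    def node_id(self, x, y):
        return y * self.width + x

    def coords(self, node):
        return node % self.width, node // self.width

    def total_cores(self):
        return self.width * self.height * CORES_PER_NODE

    def fill_xy_routes(self):
        """Assumption: fill the table with XY routing (move along X, then Y)."""
        n = self.width * self.height
        for cur in range(n):
            cx, cy = self.coords(cur)
            for dst in range(n):
                if dst == cur:
                    continue
                dx, dy = self.coords(dst)
                if cx != dx:
                    nxt = self.node_id(cx + (1 if dx > cx else -1), cy)
                else:
                    nxt = self.node_id(cx, cy + (1 if dy > cy else -1))
                self.route_table[(cur, dst)] = nxt

    def path(self, src, dst):
        """Follow table entries hop by hop from src to dst."""
        hops = [src]
        while hops[-1] != dst:
            hops.append(self.route_table[(hops[-1], dst)])
        return hops


mesh = Mesh(width=2, height=2)   # illustrative size, not taken from the paper
mesh.fill_xy_routes()
print(mesh.total_cores())        # 16
print(mesh.path(0, 3))           # [0, 1, 3]
```

      Keeping the routes in a table rather than hard-coding them is what lets software change a route by rewriting a single entry, for example `mesh.route_table[(0, 3)] = 2` to send traffic from node 0 to node 3 through node 2 instead of node 1.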

       
