ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development

• Architecture •

Loongson Instruction Set Architecture Technology

Hu Weiwu1,2,3, Wang Wenxiang3, Wu Ruiyang3, Wang Huandong3, Zeng Lu3, Xu Chenghua3, Gao Xiang3, and Zhang Fuxin1,2,4

1 (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)

2 (University of Chinese Academy of Sciences, Beijing 100190)

3 (Loongson Technology Corporation Limited, Beijing 100095)

4 (State Key Laboratory of Processors (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190)

Abstract: This paper introduces LoongArch, the Loongson instruction set architecture, which is designed to balance advancement with software compatibility. LoongArch absorbs recent advances in ISA design, making it well suited to high-performance, low-power implementation and to compiler optimization. It also incorporates the main functional features of the mainstream international ISAs: new instructions, runtime environments, and system states are added to accelerate binary translation from X86, ARM, and MIPS code to LoongArch code. This ensures lossless migration of the application binaries that run on existing Loongson computers and enables efficient binary translation from several mainstream ISAs. Binary translation systems built on top of LoongArch run MIPS Linux applications, X86 Linux and Windows applications, and ARM Android applications. LoongArch is implemented in the LS3A5000 four-core CPU developed by Loongson Technology Corporation Limited. SPEC CPU2006 results on the LS3A5000 and its FPGA system show that, with the same microarchitecture, LoongArch performs on average more than 7% better than MIPS, the ISA previously used by Loongson CPUs. With hardware support, SPEC CPU2000 programs can be translated from MIPS to LoongArch without performance loss, and translation from X86 to LoongArch reaches 2.8 times (integer subset) and 37.6 times (floating-point subset) the efficiency of the QEMU binary translator. LoongArch has the potential to remove the barriers between ISAs, so that software written for different instruction sets can converge on a unified LoongArch platform and run efficiently without distinction.

Key words: Loongson CPU, MIPS architecture, LoongArch, binary translation, compatibility, software ecosystem

CLC number: