Citation: Shu Yanjun, Zheng Xiangyu, Xu Chenghua, Huang Pei, Wang Yongqi, Zhou Fan, Zhang Zhan, Zuo Decheng. GCC Optimization for LoongArch Memory Accessing Instructions with Bound-Checking[J]. Journal of Computer Research and Development, 2025, 62(5): 1136-1150. DOI: 10.7544/issn1000-1239.202440100
The LoongArch ISA (instruction set architecture) introduces new memory access instructions with bound checking to reduce the overhead of memory safety checks. However, the existing GCC (GNU Compiler Collection) toolchain does not support this new type of instruction, so the capability remains underutilized on LoongArch-based hardware. In this paper, we therefore extend GCC to use the LoongArch bound-checking memory access instructions to optimize memory safety checks. Our work consists of three parts: 1) designing built-in functions for the bound-checking memory access instructions; 2) improving the RTL (register transfer language) optimizer of GCC so that it recognizes two kinds of semantic patterns for these instructions, namely non-exception handling and exception handling; 3) implementing in the Linux kernel a new signal, SIGBCE, for the bound check exception (BCE) raised by the CPU, and implementing the corresponding signal handling function in glibc (GNU C Library) to handle that exception. Experiments with GCC 12.2.0 on a Loongson 3C5000L server show that the revised compiler correctly emits the new memory access instructions and yields a speedup of approximately 20% in some security routines. Our work improves the LoongArch software ecosystem and supports the further development of the LoongArch ISA. It can also serve as a reference for GCC optimizations targeting other specialized instructions.
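As a rough illustration of the kind of code the abstract refers to, the C sketch below spells out a software bound check followed by a load, which is the compare-then-access pattern that a single bound-checking load could replace. The function names `checked_load_le` and `checked_sum` are hypothetical, the "address must not exceed the bound" semantics is an assumption made for this example, and none of this is the paper's built-in interface or its RTL patterns. The sketch reports a failed check through a return value, whereas the hardware path described above raises an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the paper): software version of a
 * "load only if addr <= bound" access. On failure it returns false;
 * the hardware instruction would raise a bound check exception instead. */
bool checked_load_le(const int32_t *addr, const int32_t *bound, int32_t *out)
{
    assert(out != NULL);
    if ((uintptr_t)addr > (uintptr_t)bound)
        return false;               /* explicit compare and branch */
    *out = *addr;                   /* the actual memory access */
    return true;
}

/* Hypothetical caller: sums up to `count` elements of `buf`, stopping at the
 * first access that would pass the last valid element. Returns the number of
 * elements actually read. */
size_t checked_sum(const int32_t *buf, size_t len, size_t count, int64_t *sum)
{
    *sum = 0;
    if (len == 0)
        return 0;

    const int32_t *last = buf + (len - 1);  /* bound: last valid element */
    size_t i;
    for (i = 0; i < count; i++) {
        int32_t value;
        if (!checked_load_le(buf + i, last, &value))
            break;                  /* out of bounds: stop before reading */
        *sum += value;
    }
    return i;
}
```

Calling `checked_sum` on a four-element array with `count` set to 10 reads only the four valid elements and then stops. Each iteration costs one compare and branch in addition to the load; that per-access overhead is what a fused check-and-load instruction is meant to reduce.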