Citation: Zhao Ziwei, Tu Hang, Liu Qin, Li Li, Yu Tao. An Automated Code Generation of Instruction Set Implementation and Functional Testing for gem5[J]. Journal of Computer Research and Development, 2023, 60(7): 1678-1691. DOI: 10.7544/issn1000-1239.202220158
Computer system simulators are important tools for research and prototype development of embedded systems. In interpretation-based simulators, the decoding process of the CPU model strongly affects performance, so speeding up decoding is one of the key problems of simulation efficiency. In addition, for instruction sets without a standard test suite (such as custom instructions), writing functional tests by hand is slow, even though the instruction information those tests require is practically the same as that used to implement the decoding process. To address both problems, we propose a code generation method that takes an instruction set description as input and outputs both an implementation optimized for gem5 and the corresponding functional tests. First, we extend gem5's instruction set description language and divide it into code description, function description, and test description. Second, we optimize the construction algorithm for gem5's decoding decision trees and generate decoding code, instruction code, and functional test cases. Finally, taking the Cortex-M3 instruction set as an example, we compare our method with gem5's original method: the total generation time is reduced by about 64%, the compiled executable is about 407 KB smaller, and performance improves by about 13%. The method also improves development efficiency.
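As a rough illustration of what a decoding decision tree is, the following Python sketch builds one from mask/value patterns and reuses the same patterns to derive smoke-test inputs. It is not the construction algorithm proposed in the paper, nor gem5's own generator: the names `Pattern`, `Node`, `build`, `decode`, and `smoke_tests`, as well as the toy 8-bit encodings, are assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pattern:
    name: str
    mask: int   # 1-bits mark the fixed (opcode) bits of the encoding
    value: int  # required values at those fixed bits


@dataclass
class Node:
    mask: int = 0                                  # bits tested at this node
    children: dict = field(default_factory=dict)   # (word & mask) -> Node
    leaf: list = field(default_factory=list)       # candidates checked in order


def build(patterns, tested=0):
    """Recursively split patterns on the bits that are fixed in all of them."""
    if len(patterns) <= 1:
        return Node(leaf=list(patterns))
    common = ~tested
    for p in patterns:
        common &= p.mask            # keep only bits fixed in every candidate
    if common == 0:                 # nothing left to split on: linear fallback
        return Node(leaf=list(patterns))
    groups = {}
    for p in patterns:
        groups.setdefault(p.value & common, []).append(p)
    node = Node(mask=common)
    for key, group in groups.items():
        node.children[key] = build(group, tested | common)
    return node


def decode(node, word):
    """Walk the tree, then verify the full mask at the leaf."""
    while node.children:
        node = node.children.get(word & node.mask)
        if node is None:
            return None
    for p in node.leaf:
        if (word & p.mask) == p.value:
            return p.name
    return None


def smoke_tests(patterns):
    """Derive (input word, expected name) pairs from the same description.

    Assumption: don't-care bits are set to 0, which is only adequate when
    no two patterns overlap, as in the toy set below.
    """
    return [(p.value, p.name) for p in patterns]


# Hypothetical 8-bit encodings, not taken from any real instruction set.
PATTERNS = [
    Pattern("ADD", 0xF0, 0x10),
    Pattern("SUB", 0xF0, 0x20),
    Pattern("MOV", 0xFF, 0x30),
    Pattern("NOP", 0xFF, 0x31),
]

tree = build(PATTERNS)
print(decode(tree, 0x1A))  # ADD: low nibble is an operand field
print(decode(tree, 0x31))  # NOP
print(decode(tree, 0x40))  # None: no pattern matches
```

The point of the sketch is that one description drives both the decoder and its tests, which is the observation the abstract relies on; a real generator would also need to handle overlapping patterns and choose split bits with performance in mind.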