Citation: | Zheng Jieyu, Song Zhenyu, Zhu Haoliang, Zhao Yunlei, Lin Jingqiang, Fan jin. Efficient Software Implementations of NTRU Lattice-Based Key Encapsulation Mechanisms[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440478 |
NTRU lattice is an important choice for building a practical post-quantum lattice-based key encapsulation mechanism. The software optimization engineering implementation of lattice cryptography is of great significance for the subsequent application deployment of post-quantum cryptography. CTRU is a lattice-based key encapsulation mechanism based on NTRU lattice proposed by Chinese scholars. At present, there only exists CTRU-768 scheme C and AVX2 implementation, and there is room for further optimization. In addition, the implementation of CTRU-768 cannot be directly extended to the CTRU-512 and CTRU-
[1] |
Rivest R L, Shamir A, Adleman L. A method for obtaining digital signatures and public-key cryptosystems[J]. Communications of the ACM, 1978, 21(2): 120−126 doi: 10.1145/359340.359342
|
[2] |
Koblitz N. Elliptic curve cryptosystems[J]. Mathematics of Computation, 1987, 48(177): 203−209 doi: 10.1090/S0025-5718-1987-0866109-5
|
[3] |
National Institute of Standards and Technology. Post-quantum cryptography call for proposals [EB/OL]. [2024-09-14]. https://csrc.nist.gov/Projects/post-quantum-cryptography/post-quantum-cryptography-standardization/Call-for-Proposals
|
[4] |
Avanzi R, Bos J, Ducas L, et al. CRYSTALS-Kyber algorithm specifications and supporting documentation[J]. NIST PQC Round, 2019, 2(4): 1−43
|
[5] |
Fouque P A, Kirchner P, Pornin T, et al. BAT: Small and fast KEM over NTRU lattices [J/OL]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022: 240−265 [2024-09-14]. https://tches.iacr.org/index.php/TCHES/article/view/9487
|
[6] |
Zhang Jiang, Feng Dengguo, Yan Di. NEV: Faster and smaller NTRU encryption using vector decoding [C/OL]//Proc of ASIACRYPT 2023. Berlin: Springer, 2023: 157−189 [2024-09-14]. https://link.springer.com/chapter/10.1007/978-981-99-8739-9_6
|
[7] |
Jin Zhengzhong, Shen Shiyu, Zhao Yunlei. Compact and flexible KEM from ideal lattice[J]. IEEE Transactions on Information Theory, 2022, 68(6): 3829−3840 doi: 10.1109/TIT.2022.3148586
|
[8] |
Liang Zhichuang, Fang Boyue, Zheng Jieyu, et al. Compact and efficient KEMs over NTRU lattices[J]. Computer Standards & Interfaces, 2024, 89: 103828
|
[9] |
密码行业标准化技术委员会. 2023年度密码行业标准制修订任务(商用密码领域)[S]. 北京:密码行业标准化技术委员会,2023
Cryptography Standardization Technical Committee. 2023 Annual Encryption Industry Standard Formulation and Revision Tasks (Commercial Encryption Field) [S]. Beijing: Cryptography Standardization Technical Committee, 2023 (in Chinese)
|
[10] |
Intel Corporation. Intel instruction set architecture extensions [EB/OL]. [2024-09-14]. http://software.intel.com/en-us/isa-extensions
|
[11] |
Lomont C. Introduction to intel advanced vector extensions [EB/OL]. [2024-09-14]. https://hpc.llnl.gov/sites/default/files/intelAVXintro.pdf
|
[12] |
Reinders J R. Intel AVX−512 Instructions [EB/OL]. [2024-09-14]. https://www.intel.com/content/www/us/en/developer/articles/technical/intel-avx-512-instructions.html
|
[13] |
Bakos J D. Embedded Systems: ARM Programming and Optimization [M]. Amsterdam: Elsevier, 2023
|
[14] |
Dang V B, Mohajerani K, Gaj K. High-speed hardware architectures and FPGA benchmarking of CRYSTALS-Kyber, NTRU, and Saber[J]. IEEE Transactions on Computers, 2022, 72(2): 306−320
|
[15] |
El Khatib R, Azarderakhsh R, Mozaffari-Kermani M. High-performance FPGA accelerator for SIKE[J]. IEEE Transactions on Computers, 2021, 71(6): 1237−1248
|
[16] |
Ricci S, Malina L, Jedlicka P, et al. Implementing CRYSTALS-Dilithium signature scheme on FPGAs [C/OL]//Proc of the 16th Int Conf on Availability, Reliability and Security. New York: ACM, 2021 [2024-09-14]. https://dl.acm.org/doi/abs/10.1145/3465481.3465756
|
[17] |
Shim K A, Lee S, Koo N. Efficient implementations of Rainbow and UOV using AVX2 [J]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022: 245−269 [2024-09-14]. https://tches.iacr.org/index.php/TCHES/article/view/9296
|
[18] |
Chou T, Liou J H. A constant-time AVX2 implementation of a variant of ROLLO [J]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2022: 152−174 [2024-09-14]. https://tches.iacr.org/index.php/TCHES/article/view/9293
|
[19] |
Drucker N, Gueron S. Speed up over the Rainbow [C] // Proc of the 18th Int Conf on Information Technology-New Generations. Berlin: Springer, 2021: 131−136
|
[20] |
Abdulrahman A, Hwang V, Kannwischer M J, et al. Faster Kyber and Dilithium on the cortex-m4 [C] //Proc of the Int Conf on Applied Cryptography and Network Security. Berlin: Springer, 2022: 853−871
|
[21] |
Botros L, Kannwischer M J, Schwabe P. Memory-efficient high-speed implementation of Kyber on Cortex-M4 [C] //Proc of the 11th Int Conf on Cryptology in Africa. Berlin: Springer, 2019: 209−228
|
[22] |
Kannwischer M J, Rijneveld J, Schwabe P. Faster multiplication in on Cortex-M4 to speed up NIST PQC candidates [C] //Proc of the Int Conf on Applied Cryptography and Network Security. Berlin: Springer, 2019: 281−301
|
[23] |
Chung C M M, Hwang V, Kannwischer M J, et al. NTT multiplication for NTT-unfriendly rings: New speed records for Saber and NTRU on Cortex-M4 and AVX2 [J/OL]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2021: 159−188 [2024-09-14]. https://tches.iacr.org/index.php/TCHES/article/view/8791
|
[24] |
Hwang V. Pushing the limit of vectorized polynomial multiplications for NTRU prime [C] //Proc of the Australasian Conf on Information Security and Privacy. Berlin: Springer, 2024: 84−102
|
[25] |
Alkim E,Cheng D Y L,Chung C M M,et al. Polynomial multiplication in NTRU prime:Comparison of optimization strategies on Cortex-M4 [J]. Cryptology ePrint Archive,2020
|
[26] |
Dai Wei, Whyte W, Zhang Zhenfei. Optimizing polynomial convolution for NTRUEncrypt[J]. IEEE Transactions on Computers, 2018, 67(11): 1572−1583
|
[27] |
Becker H,Mera J M B,Karmakar A,et al. Polynomial multiplication on embedded vector architectures [J]. Cryptology ePrint Archive,2021
|
[28] |
Bernstein D J, Brumley B B, Chen M S, et al. OpenSSLNTRU: Faster post-quantum TLS key exchange [C] //Proc of the 31st USENIX Security Symp. Berkeley, CA: USENIX Association, 2022: 845−862
|
[29] |
Fujisaki E,Okamoto T. Secure integration of asymmetric and symmetric encryption schemes [C] //Proc of the Annual Int Cryptology Conf. Berlin:Springer,1999:537−554
|
[30] |
Hofheinz D,Hövelmanns K,Kiltz E. A modular analysis of the Fujisaki-Okamoto transformation [C] //Proc of the Theory of Cryptography Conf. Berlin:Springer,2017:341−371
|
[31] |
Duman J,Hövelmanns K,Kiltz E,et al. Faster lattice-based KEMs via a generic Fujisaki-Okamoto transform using prefix hashing[C] //Proc of the 2021 ACM SIGSAC Conf on Computer and Communications Security. New York:ACM,2021:2722−2737
|
[32] |
Nussbaumer H J, Nussbaumer H J. Number theoretic transforms [J/OL]. Fast Fourier Transform and Convolution Algorithms, 1981: 211−240[2024-09-14]. https://link.springer.com/chapter/10.1007/978-3-662-00551-4_8
|
[33] |
Seiler G. Faster AVX2 optimized NTT multiplication for Ring-LWE lattice cryptography [J]. Cryptology ePrint Archive,2018
|
[34] |
Cooley J W, Tukey J W. An algorithm for the machine calculation of complex Fourier series[J]. Mathematics of Computation, 1965, 19(90): 297−301 doi: 10.1090/S0025-5718-1965-0178586-1
|
[35] |
Gentleman W M, Sande G. Fast fourier transforms: For fun and profit [C]//Proc of the Fall Joint Computer Conf. New York: ACM, 1966: 563−578
|
[36] |
Weimerskirch A, Paar C. Generalizations of the Karatsuba algorithm for efficient implementations [J]. Cryptology ePrint Archive, 2006
|
[37] |
Bernstein D J, Yang B Y. Fast constant-time gcd computation and modular inversion [J/OL]. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2019: 340−398 [2024-09-14]. https://tches.iacr.org/index.php/TCHES/article/view/8298
|
[38] |
Fog A. Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD, and VIA CPUs [EB/OL]. [2024-09-14]. https://www.agner.org/optimize/instruction_tables.pdf
|
[39] |
Barrett P. Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor [C]//Proc of the Conf on the Theory and Application of Cryptographic Techniques. Berlin: Springer, 1986: 311−323
|
[40] |
Montgomery P L. Modular multiplication without trial division[J]. Mathematics of Computation, 1985, 44(170): 519−521 doi: 10.1090/S0025-5718-1985-0777282-X
|
[41] |
Nguyen D T, Gaj K. Optimized software implementations of CRYSTALS-Kyber, NTRU, and Saber using NEON-based special instructions of ARMv8 [C/OL]//Proc of the 3rd NIST PQC Standardization Conf. 2021 [2024-09-14]. https://csrc.nist.gov/CSRC/media/Events/third-pqc-standardization-conference/documents/accepted-papers/nguyen-optimized-software-gmu-pqc2021.pdf
|
[42] |
Greub W H. Linear Algebra [M]. Berlin: Springer, 2012
|
[43] |
Dworkin M J. SHA−3 Standard: Permutation-based Hash and Extendable-output Functions [S]. Gaithersburg: The National Institute of Standards and Technology (NIST), 2015
|
[44] |
Schwabe P, Seiller G, Lepoint T et al. Kyber-post-quantum cryptography [CP/OL]. [2024-09-11]. https://github.com/pq-crystals/kyber
|
[1] | Yan Xinkai, Huo Yuchi, Bao Hujun. Survey on Neural Rendering and Its Hardware Acceleration[J]. Journal of Computer Research and Development, 2024, 61(11): 2846-2869. DOI: 10.7544/issn1000-1239.202330483 |
[2] | Zhu Suxia, Wang Jinyin, Sun Guanglu. Perceptual Similarity-Based Multi-Objective Optimization for Stealthy Image Backdoor Attack[J]. Journal of Computer Research and Development, 2024, 61(5): 1182-1192. DOI: 10.7544/issn1000-1239.202330521 |
[3] | Ma Xinyu, Fan Yixing, Guo Jiafeng, Zhang Ruqing, Su Lixin, Cheng Xueqi. An Empirical Investigation of Generalization and Transfer in Short Text Matching[J]. Journal of Computer Research and Development, 2022, 59(1): 118-126. DOI: 10.7544/issn1000-1239.20200626 |
[4] | Liu Yifan, Xu Kun. A Survey on Many-Lights Rendering Methods[J]. Journal of Computer Research and Development, 2020, 57(1): 17-31. DOI: 10.7544/issn1000-1239.2020.20190208 |
[5] | Li Ying, Tang Yong, Zhang Haoran, Liu Ding, Zhou Shengteng, Wang Sai. Real Time Rendering of Large Scale Realistic Ocean Scenes Driven by Time and Space[J]. Journal of Computer Research and Development, 2019, 56(2): 375-384. DOI: 10.7544/issn1000-1239.2019.20170895 |
[6] | Zheng Liping, Chan Bin, Wang Wenping, Liu Xiaoping, Cao Li, Kuang Zhengzheng. Remote Visualization Based on Distributed Rendering Framework[J]. Journal of Computer Research and Development, 2012, 49(7): 1438-1449. |
[7] | Qian Xiaoyan, Xiao Liang, Wu Huizhong. An Adaptive LIC Rendering Method for Fluid Artistic Style[J]. Journal of Computer Research and Development, 2007, 44(9): 1588-1594. |
[8] | Hu Wei and Qin Kaihuai. A New Rendering Technology of GPU-Accelerated Radiosity[J]. Journal of Computer Research and Development, 2005, 42(6): 945-950. |
[9] | Ma Renan, Zhang Erhua, Yang Jinyu, and Zhao Chunxia. Research on Segmenting and Volume Rendering of Irregular Seismic Events[J]. Journal of Computer Research and Development, 2005, 42(5): 883-887. |
[10] | Huang Hua, Luo Siwei, Liu Yunhui, and Li Aijun. Knowledge Increase Ability of Artificial Neural Network[J]. Journal of Computer Research and Development, 2005, 42(2): 224-229. |