Abstract:
Digital signatures play a critical role in information security; however, traditional digital signature algorithms are at risk of becoming obsolete in the post-quantum era. SPHINCS+, as a digital signature framework resistant to quantum computing attacks, is expected to become increasingly important in this new era. Nevertheless, the relatively slow computational speed of SPHINCS+ poses challenges in meeting the high throughput and low latency demands of modern cryptographic applications, significantly limiting its practicality. This paper presents an efficient optimization strategy based on a domestic DCU (Deep Computing Unit) to accelerate the SPHINCS+ algorithm instantiated with the domestic SM3 hash function. By enhancing memory copy efficiency, optimizing the computational processes of SM3 and SPHINCS+, and employing optimal computational parallelism, we implemented the 128-f mode of SPHINCS+-SM3 on the DCU. Experimental results demonstrate that, compared to traditional CPU implementations, our DCU-based implementation achieves a significant increase in throughput, improving signature generation and verification by 2603.87 times and 1281.98 times, respectively. This substantial improvement in computational efficiency and practicality enhances the feasibility of SPHINCS+ and advances the domestic adoption of post-quantum cryptographic algorithms. In scenarios involving high data traffic and large volumes of signature requests, the DCU implementation exhibits significant performance advantages over CPU implementations.