    Wang Chuang, Ding Yan, Huang Chenlin, Song Liantao. Bitsliced Optimization of SM4 Algorithm with the SIMD Instructions[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202220531

    Bitsliced Optimization of SM4 Algorithm with the SIMD Instructions

    • SM4 is a commercial block cipher independently designed in China, and its encryption and decryption performance has become one of the critical factors affecting data confidentiality in information systems. Existing optimizations mainly focus on hardware designs and software look-up tables, which suffer from dependence on specific hardware environments, low efficiency, and vulnerability to side-channel attacks. Bitslicing processes block ciphers efficiently in parallel by reorganizing the input data, and it resists cache-based side-channel attacks. However, existing research on bitsliced block ciphers is highly dependent on the hardware platform and supports only a single processor architecture, and the parallel processing pipeline is slow to start. As a result, encryption and decryption of small-scale data can hardly take full advantage of advanced instruction sets such as SIMD (single instruction, multiple data) instructions. To resolve these problems, this paper first proposes a cross-platform, general bitsliced block cipher model, which supports a general data slicing method that provides consistent data slicing across different processor instruction sets. On this basis, a fine-grained bitsliced SM4 optimization algorithm for SIMD instructions is proposed, which effectively shortens the startup time of the algorithm through fine-grained plaintext slicing and reorganization and through optimization of the linear transformation. Experiments show that the encryption rate reaches up to 438.0 MBps and the cost of encrypting one byte reaches 7.0 CPB (cycles per byte); compared with the look-up-table-based SM4 algorithm, encryption performance improves by 80.4% to 430.3% on average.
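
    The data reorganization behind bitslicing can be illustrated with a minimal Python sketch. It is not the paper's algorithm and does not implement SM4: the function names (`bitslice`, `unbitslice`, `sliced_xor_key`), the 8-bit block width, and the sample values are all assumptions chosen for illustration. The sketch transposes several blocks into bit planes, so that one word-level operation acts on the same bit position of every block at once, which is the property that SIMD registers exploit at larger widths.

```python
def bitslice(blocks, width):
    """Transpose blocks into `width` bit planes.

    Bit j of planes[i] holds bit i of blocks[j] (hypothetical layout).
    """
    planes = [0] * width
    for j, block in enumerate(blocks):
        for i in range(width):
            planes[i] |= ((block >> i) & 1) << j
    return planes


def unbitslice(planes, count):
    """Inverse of bitslice: rebuild `count` blocks from the bit planes."""
    blocks = [0] * count
    for i, plane in enumerate(planes):
        for j in range(count):
            blocks[j] |= ((plane >> j) & 1) << i
    return blocks


def sliced_xor_key(planes, key, count):
    """XOR the same key into every block, one plane at a time.

    If bit i of the key is set, plane i is flipped for all blocks at once.
    No table look-up is indexed by secret data.
    """
    mask = (1 << count) - 1  # one bit per block
    return [p ^ (mask if (key >> i) & 1 else 0) for i, p in enumerate(planes)]


# Illustrative values only: four 8-bit "blocks" and an 8-bit "round key".
blocks = [0x3A, 0xC5, 0x0F, 0x90]
key = 0x5C
planes = bitslice(blocks, 8)
result = unbitslice(sliced_xor_key(planes, key, len(blocks)), len(blocks))
```

    In this toy layout each plane is only four bits wide; with a 256-bit SIMD register the same plane-wise operation would cover 256 blocks, which is also why the transposition cost matters for small inputs.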