Liu Weixin, Guan Yewei, Huo Jiarong, Ding Yuanchao, Guo Hua, Li Bo. A Fast and Secure Transformer Inference Scheme with Secure Multi-Party Computation[J]. Journal of Computer Research and Development, 2024, 61(5): 1218-1229. DOI: 10.7544/issn1000-1239.202330966

A Fast and Secure Transformer Inference Scheme with Secure Multi-Party Computation

Funds: This work was supported by the National Key Research and Development Program of China (2021YFB2700200) and the National Natural Science Foundation of China (U21B2021, 61972018, 61932014).
  • Author Bio:

    Liu Weixin: born in 2001. Master's candidate. His main research interests include privacy-preserving machine learning and cryptography.

    Guan Yewei: born in 1999. PhD candidate. His main research interests include secure multi-party computation and privacy-preserving machine learning.

    Huo Jiarong: born in 2002. Undergraduate. His main research interest is applied cryptography.

    Ding Yuanchao: born in 1999. Master's candidate. His main research interests include privacy-preserving machine learning and cryptography.

    Guo Hua: born in 1980. PhD, associate professor. Member of CCF. Her main research interests include privacy-preserving machine learning and cryptography.

    Li Bo: born in 1981. PhD, associate professor. Member of CCF. His main research interests include privacy-preserving machine learning and network security.

  • Received Date: November 30, 2023
  • Revised Date: March 10, 2024
  • Accepted Date: March 10, 2024
  • Available Online: March 10, 2024
  • Transformer has been widely used in fields such as natural language processing and computer vision, where it delivers outstanding performance. During inference, however, users' data is exposed to the Transformer model provider. As public attention to data privacy grows, this leakage problem has motivated research on secure Transformer inference, and implementing it with secure multi-party computation (MPC) is currently a hot topic. Because non-linear functions are pervasive in Transformer, secure Transformer inference under MPC is difficult and incurs huge computation and communication costs. We focus on Softmax attention, the bottleneck in secure Transformer inference, and propose two MPC-friendly attention mechanisms: Softmax freeDiv Attention and 2Quad freeDiv Attention. By replacing the Softmax attention in Transformer with the proposed MPC-friendly attention mechanisms, and combining this with a replacement for the GeLU activation function and with knowledge distillation, we propose an MPC-friendly Transformer conversion framework that converts a Transformer model into an MPC-friendly one, thereby improving the performance of subsequent secure inference. Based on this framework, we perform secure Bert-Base inference on SST-2 in the LAN setting, using the privacy computing protocols provided by the secure processing unit (SPU). The results show that secure inference achieves a 2.26 times speedup while maintaining accuracy on par with the non-approximated model.
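
To make concrete why Softmax and GeLU are costly under MPC, the following plaintext sketch contrasts the standard functions with polynomial stand-ins of the kind used in prior MPC-friendly Transformer work (a quadratic "2Quad" normalisation and a quadratic GeLU). It is an illustration under stated assumptions, not the scheme proposed in the paper: the shift `c = 5.0` and the coefficients in `quad_gelu` are assumed values, and the function names are hypothetical. Both normalisations shown here still contain a division; the "freeDiv" variants named in the abstract presumably target that step, but the abstract does not define them, so they are not reproduced.

```python
import math

def softmax(scores):
    """Standard Softmax: needs a max, an exp, and a division per row."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def quad2(scores, c=5.0):
    """2Quad-style normalisation: (x + c)^2 / sum_j (x_j + c)^2.
    The exp is replaced by a square; c is an assumed constant shift."""
    squares = [(s + c) ** 2 for s in scores]
    total = sum(squares)
    return [v / total for v in squares]

def gelu(x):
    """Exact GeLU via the Gaussian CDF (requires erf)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def quad_gelu(x):
    """Quadratic GeLU stand-in with assumed coefficients:
    only additions and multiplications, no erf."""
    return 0.125 * x * x + 0.25 * x + 0.5

# One row of attention scores (arbitrary example values).
scores = [0.5, 1.0, -0.3, 2.0]
p_soft = softmax(scores)
p_quad = quad2(scores)

print("softmax:", [round(p, 4) for p in p_soft])
print("2Quad:  ", [round(p, 4) for p in p_quad])
print("gelu(1.0) vs quad_gelu(1.0):", gelu(1.0), quad_gelu(1.0))
```

Both normalisations produce weights that sum to 1 and rank the scores in the same order here, but the individual values differ noticeably, and the quadratic GeLU is not a close pointwise match to the exact one. That gap is presumably why the abstract combines the replacements with knowledge distillation.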
