• China Top-Quality Science and Technology Journal
  • CCF Recommended Class A Chinese Journal
  • T1-Class High-Quality Science and Technology Journal in the Computing Field
Liu Weixin, Guan Yewei, Huo Jiarong, Ding Yuanchao, Guo Hua, Li Bo. A Fast and Secure Transformer Inference Scheme with Secure Multi-Party Computation[J]. Journal of Computer Research and Development, 2024, 61(5): 1218-1229. DOI: 10.7544/issn1000-1239.202330966

A Fast and Secure Transformer Inference Scheme with Secure Multi-Party Computation

Funds: This work was supported by the National Key Research and Development Program of China (2021YFB2700200) and the National Natural Science Foundation of China (U21B2021, 61972018, 61932014).
  • Author Bio:

    Liu Weixin: born in 2001. Master candidate. His main research interests include privacy-preserving machine learning, cryptography

    Guan Yewei: born in 1999. PhD candidate. His main research interests include secure multi-party computation and privacy-preserving machine learning

    Huo Jiarong: born in 2002. Undergraduate. His main research interest is applied cryptography

    Ding Yuanchao: born in 1999. Master candidate. His main research interests include privacy-preserving machine learning and cryptography

    Guo Hua: born in 1980. PhD, associate professor. Member of CCF. Her main research interests include privacy-preserving machine learning and cryptography

    Li Bo: born in 1981. PhD, associate professor. Member of CCF. His main research interests include privacy-preserving machine learning and network security

  • Received Date: November 30, 2023
  • Revised Date: March 10, 2024
  • Accepted Date: March 10, 2024
  • Available Online: March 10, 2024
  • Transformer has been widely used in many fields, such as natural language processing and computer vision, with outstanding performance. However, users' data is leaked to the Transformer model provider during inference. With growing public attention to data privacy, this leakage problem has prompted research on secure Transformer inference. Implementing secure Transformer inference with secure multi-party computation (MPC) is a current research hotspot. Because non-linear functions are pervasive in Transformer, secure Transformer inference is hard to implement with MPC, which leads to heavy computation and communication costs. We focus on Softmax attention, the bottleneck in secure Transformer inference, and propose two MPC-friendly attention mechanisms, Softmax freeDiv Attention and 2Quad freeDiv Attention. By replacing the Softmax attention in Transformer with the proposed MPC-friendly attention mechanisms, combined with the replacement of the GeLU activation function and knowledge distillation, we propose an MPC-friendly Transformer conversion framework, which converts a Transformer model into an MPC-friendly one so as to improve the performance of subsequent secure Transformer inference. Based on the proposed MPC-friendly Transformer conversion framework, we perform secure Bert-Base inference on SST-2 in the LAN setting, using the privacy computing protocols provided by the secure processing unit (SPU). The results show that the secure inference achieves a 2.26x speedup while maintaining the accuracy of the non-approximated model.
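The abstract does not give the exact form of the proposed freeDiv attention mechanisms, but the general idea of MPC-friendly attention can be illustrated with a minimal NumPy sketch: Softmax's exponential (expensive under secret sharing) is replaced by a cheap quadratic. The function names and the shift constant `c` below are illustrative assumptions, not the paper's definitions; the normalizing division is kept here for readability, whereas the paper's "freeDiv" variants additionally remove it.

```python
import numpy as np

def softmax_attention(q, k, v):
    """Standard scaled dot-product attention with Softmax."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # exp and division are both costly to evaluate under MPC
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def quad_attention(q, k, v, c=5.0):
    """Hypothetical MPC-friendly variant: exp replaced by a square.

    Squaring is a single secure multiplication under secret sharing.
    The remaining division is the part a 'freeDiv' design would also
    eliminate.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    weights = (scores + c) ** 2          # non-negative, MPC-cheap
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Both functions map queries, keys, and values of shape (n, d) to an (n, d) output, so the quadratic variant is a drop-in replacement whose accuracy gap can then be closed by knowledge distillation, as the framework described above does.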

  • [1]
    Hoffmann J, Borgeaud S, Mensch A, et al. An empirical analysis of compute-optimal large language model training[J]. Advances in Neural Information Processing Systems, 2022, 35: 30016−30030
    [2]
    Chan S, Santoro A, Lampinen A, et al. Data distributional properties drive emergent in-context learning in transformers[J]. Advances in Neural Information Processing Systems, 2022, 35: 18878−18891
    [3]
    Liu Ze, Lin Yutong, Cao Yue, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]//Proc of the IEEE/CVF Int Conf on Computer Vision. Piscataway, NJ: IEEE , 2021: 10012−10022
    [4]
    Liu Ze, Hu Han, Lin Yutong, et al. Swin transformer v2: Scaling up capacity and resolution[C]//Proc of the IEEE/CVF Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2022: 12009−12019
    [5]
    Jawalkar N, Gupta K, Basu A, et al. Orca: FSS-based secure training with GPUs[J]. Cryptology ePrint Archive, 2023
    [6]
    Hao Meng, Li Hongwei, Chen Hanxiao, et al. Iron: Private inference on transformers[J]. Advances in Neural Information Processing Systems, 2022, 35: 15718−15731
    [7]
    Chen Tianyu, Bao Hangbo, Huang Shaohan, et al. THE-X: Privacy-preserving transformer inference with homomorphic encryption[C]//Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg, PA: ACL, 2022: 3510−3520
    [8]
    Zheng Mengxin, Lou Qian, Lei Jiang. Primer: Fast private transformer inference on encrypted data[J]. arXiv preprint, arXiv: 2303.13679, 2023
    [9]
    Gupta K, Jawalkar N, Mukherjee A, et al. Sigma: Secure gpt inference with function secret sharing[J]. Cryptology ePrint Archive, 2023
    [10]
    Juvekar C, Vaikuntanathan V, Chandrakasan A. GAZELLE: A low latency framework for secure neural network inference[C]//Proc of the 27th USENIX Conf on Security Symp. Berkeley, CA: USENIX Association, 2018: 1651−1669
    [11]
    Jiang Xiaoqian, Kim M, Lauter K, et al. Secure outsourced matrix computation and application to neural networks[C]//Proc of the 2018 ACM SIGSAC Conf on Computer and Communications Security. New York: Association for Computing Machinery, 2018: 1209−1222
    [12]
    Mohassel P, Zhang Y. SecureML: A system for scalable privacy-preserving machine learning[C]//Proc of 2017 IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2017: 19−38
    [13]
    Huang Zhicong, Lu Wenjie, Hong Cheng, et al. Cheetah: Lean and fast secure two-party deep neural network inference[C]//Proc of the 31st USENIX Security Symp (USENIX Security 22). Berkeley, CA: USENIX Association, 2022: 809−826
    [14]
    Wang Ning, Xiao Xiaohui, Yang Yin, et al. Collecting and analyzing multidimensional data with local differential privacy[C]//Proc of the 2019 IEEE 35th Int Conf on Data Engineering (ICDE). Piscataway, NJ: IEEE, 2019: 638−649
    [15]
    Truong J B, Gallagher W, Guo Tian, et al. Memory-efficient deep learning inference in trusted execution environments[C]//Proc of the 2021 IEEE Int Conf on Cloud Engineering (IC2E). Piscataway, NJ: IEEE, 2021: 161−167
    [16]
    Akavia A, Leibovich M, Resheff Y S, et al. Privacy-preserving decision trees training and prediction[J]. ACM Transactions on Privacy and Security, 2022, 25(3): 1−30
    [17]
    Park S, Byun J, Lee J. Privacy-preserving fair learning of support vector machine with homomorphic encryption[C]//Proc of the ACM Web Conf 2022. New York: ACM, 2022: 3572−3583
    [18]
    Mohassel P, Rindal P. ABY3: A mixed protocol framework for machine learning[C]//Proc of the 2018 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2018: 35−52
    [19]
    Rathee D, Rathee M, Kumar N, et al. Cryptflow2: Practical 2-party secure inference[C]//Proc of the 2020 ACM SIGSAC Conf on Computer and Communications Security. New York: ACM, 2020: 325−342
    [20]
    Hou Xiaoyang, Liu Jian, Li Jingyu, et al. Ciphergpt: Secure two-party gpt inference[J]. Cryptology ePrint Archive, 2023
    [21]
    Li Dacheng, Shao Rulin, Wang Hongyi, et al. MPCFormer: Fast, performant and private transformer inference with MPC[J]. arXiv preprint, arXiv: 2211.01452, 2022
    [22]
    Zeng Wenxuan, Li Meng, Xiong Wenjie, et al. MPCViT: Searching for MPC-friendly vision transformer with heterogeneous attention[J]. arXiv preprint, arXiv: 2211.13955, 2022
    [23]
    Mishra P, Lehmkuhl R, Srinivasan A, et al. Delphi: A cryptographic inference system for neural networks[C]//Proc of the 29th USENIX Conf on Security Symp. Berkeley, CA: USENIX Association, 2020: 2505−2522
    [24]
    Wang Xiaolong, Girshick R, Gupta A, et al. Non-local neural networks[C]//Proc of the IEEE Conf on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018: 7794−7803
    [25]
    Wang Sinong, Li B Z, Khabsa M, et al. Linformer: Self-attention with linear complexity[J]. arXiv preprint, arXiv: 2006.04768, 2020
    [26]
    Rathee D, Rathee M, Goli R K K, et al. Sirnn: A math library for secure RNN inference[C]//Proc of 2021 IEEE Symp on Security and Privacy (SP). Piscataway, NJ: IEEE, 2021: 1003−1020
    [27]
    Dong Ye, Lu Wenjie, Zheng Yancheng, et al. Puma: Secure inference of LLaMA-7B in five minutes[J]. arXiv preprint, arXiv: 2307.12533, 2023
    [28]
    Akimoto Y, Fukuchi K, Akimoto Y, et al. Privformer: Privacy-preserving transformer with MPC[C]//Proc of 2023 IEEE 8th European Symp on Security and Privacy (EuroS&P). Piscataway, NJ: IEEE, 2023: 392−410
    [29]
    Wagh S, Tople S, Benhamouda F, et al. Falcon: Honest-majority maliciously secure framework for private deep learning[J]. arXiv preprint, arXiv: 2004.02229, 2020
    [30]
    Chou E, Beal J, Levy D, et al. Faster Cryptonets: Leveraging sparsity for real-world encrypted inference[J]. arXiv preprint, arXiv: 1811.09953, 2018
    [31]
    Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint , arXiv: 1503.02531, 2015
    [32]
    Ma Junming, Zheng Yancheng, Feng Jun, et al. SecretFlow-SPU: A performant and user-friendly framework for privacy-preserving machine learning[C]//Proc of 2023 USENIX Annual Technical Conf (USENIX ATC 23). Berkeley, CA: USENIX Associatioin, 2023: 17−33
