ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2021, Vol. 58, Issue 5: 1092-1105. doi: 10.7544/issn1000-1239.2021.20200908

Special Issue: 2021 Artificial Intelligence Security and Privacy Protection Techniques


Evaluating Privacy Risks of Deep Learning Based General-Purpose Language Models

Pan Xudong, Zhang Mi, Yan Yifan, Lu Yifan, Yang Min   

  (School of Computer Science, Fudan University, Shanghai 200438)
  • Online:2021-05-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61972099, U1636204, U1836213, U1836210, U1736208) and the Natural Science Foundation of Shanghai (19ZR1404800).

Abstract: Recently, a variety of Transformer-based GPLMs (general-purpose language models), including Google’s BERT (bidirectional encoder representations from transformers), have been proposed in NLP (natural language processing). GPLMs achieve state-of-the-art performance on a wide range of NLP tasks and are widely deployed in industrial applications. Despite their generality and promising performance, a recent study first showed that an attacker with access to the textual embeddings produced by a GPLM can infer, with high accuracy, whether the original text contains a specific keyword. However, that work has the following limitations. First, it considers only the occurrence of a single sensitive word as the sensitive information to steal, which falls short of a truly threatening privacy violation. Besides, the attack rests on rather strict assumptions about the attacker’s capability, e.g., that the attacker knows which GPLM produced the victim’s textual embeddings. Moreover, only GPLMs designed for English text are considered. To address these limitations and complement that work, this paper proposes a more comprehensive privacy theft chain designed to explore whether general-purpose language models pose even more privacy risks. Via experiments on 13 commercial GPLMs, we empirically show that an attacker can, step by step, infer the GPLM type behind a textual embedding with near-100% accuracy, then infer the text length with over 70% accuracy on average, and finally probe sensitive words that possibly occur in the original text, which provides useful information for reconstructing the sensitive semantics. Besides, this paper also evaluates the privacy risks of three typical Chinese general-purpose language models. The results confirm that privacy risks also exist in Chinese GPLMs, which calls for mitigation studies in the future.
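The first step of the privacy theft chain described above is attributing a textual embedding to the GPLM that produced it. As a minimal illustrative sketch (not the paper's actual method), the idea can be shown with a nearest-centroid classifier over synthetic embeddings: each hypothetical model leaves a characteristic statistical fingerprint in its output space, which an attacker can learn by querying the models themselves. The model names, dimensions, and noise levels below are assumptions for illustration only.

```python
# Illustrative sketch of GPLM-type attribution from embeddings.
# All data is synthetic; real attacks would query actual models (e.g., BERT).
import random

random.seed(0)
DIM = 16  # embedding dimension (assumed; real GPLMs use 768, 1024, etc.)

# Each hypothetical GPLM has a distinct mean embedding vector; individual
# embeddings are that mean plus small per-sample noise.
centers = {
    "gplm_a": [random.gauss(0, 1) for _ in range(DIM)],
    "gplm_b": [random.gauss(0, 1) for _ in range(DIM)],
    "gplm_c": [random.gauss(0, 1) for _ in range(DIM)],
}

def sample_embedding(center):
    """Draw one noisy embedding around a model's fingerprint."""
    return [c + random.gauss(0, 0.1) for c in center]

# The attacker builds a labeled training set by querying each model.
train = [(name, sample_embedding(c)) for name, c in centers.items()
         for _ in range(50)]

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]

centroids = {name: centroid([e for n, e in train if n == name])
             for name in centers}

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def attribute(embedding):
    """Attribute a victim embedding to the nearest model centroid."""
    return min(centroids, key=lambda name: sq_dist(centroids[name], embedding))

# Evaluate attribution accuracy on fresh "victim" embeddings.
test = [(name, sample_embedding(c)) for name, c in centers.items()
        for _ in range(20)]
accuracy = sum(attribute(e) == name for name, e in test) / len(test)
print(f"attribution accuracy: {accuracy:.2f}")
```

On this synthetic setup the centroids are well separated, so attribution is essentially perfect, mirroring the near-100% model-type inference accuracy reported in the abstract; subsequent steps of the chain (length inference, keyword probing) would build similar classifiers on top of the attributed embeddings.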

Key words: deep learning privacy, general-purpose language models (GPLMs), natural language processing, deep learning, artificial intelligence, information security
