
Evaluating Privacy Risks of Deep Learning Based General-Purpose Language Models

Pan Xudong, Zhang Mi, Yan Yifan, Lu Yifan, Yang Min

Citation: Pan Xudong, Zhang Mi, Yan Yifan, Lu Yifan, Yang Min. Evaluating Privacy Risks of Deep Learning Based General-Purpose Language Models[J]. Journal of Computer Research and Development, 2021, 58(5): 1092-1105. DOI: 10.7544/issn1000-1239.2021.20200908. CSTR: 32373.14.issn1000-1239.2021.20200908

  • CLC number: TP309

Funds: This work was supported by the National Natural Science Foundation of China (61972099, U1636204, U1836213, U1836210, U1736208) and the Natural Science Foundation of Shanghai (19ZR1404800).
  • Abstract: Recently, a variety of Transformer-based general-purpose language models (GPLMs), including Google's BERT (bidirectional encoder representation from transformers), have been proposed in natural language processing (NLP). GPLMs achieve state-of-the-art performance on a wide range of standard datasets and NLP tasks and are increasingly applied in commercial settings. Despite their generality and promising performance, the security of GPLMs in real deployment scenarios has received little attention from researchers. A recent study first showed that an attacker who obtains the textual embeddings produced by a GPLM from a user's input text, for example via a man-in-the-middle attack or as an honest-but-curious service provider, can infer with high accuracy whether the original text contains a specific keyword. However, that work has several limitations. First, it considers only the occurrence of a single sensitive word as the information to steal, which is still far from a threatening privacy violation. Second, its attack relies on rather strict assumptions about the attacker's capability, e.g., that the attacker knows which GPLM produced the victim's embeddings. Third, it covers only GPLMs designed for English text. To address these limitations and complement that work, this paper proposes a more comprehensive privacy theft chain that evaluates the potential privacy risks of GPLMs along more dimensions. Experiments on 13 commercial GPLMs empirically show that, given only the textual embeddings, an attacker can step by step infer the GPLM type behind an embedding with near 100% accuracy, infer the length of the original text with over 70% accuracy on average, and finally probe the sensitive words most likely to occur in the original text, which provides enough information to reconstruct its sensitive semantics. In addition, this paper evaluates the privacy risks of three typical Chinese pre-trained general-purpose language models. The results confirm that non-negligible privacy risks also exist in Chinese GPLMs, which calls for mitigation studies in the future.
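    To make the privacy theft chain described above concrete, the sketch below shows one way an attacker-side pipeline could be organized: train one lightweight classifier per hidden attribute (source model, text-length bucket, keyword occurrence) on a labelled "shadow" corpus of embeddings. This is a minimal illustration using synthetic embeddings and scikit-learn classifiers; the model names, leakage pattern, and length buckets are placeholders, not the authors' implementation, in which the embeddings come from real GPLMs such as BERT.

```python
# Minimal sketch of an attacker-side inference chain over intercepted embeddings:
# (1) infer which GPLM produced an embedding, (2) infer the length of the original
# text, (3) probe for a sensitive keyword. Embeddings here are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
DIM = 768                                  # width of a typical BERT-base embedding
MODELS = ["gplm-a", "gplm-b", "gplm-c"]    # hypothetical candidate source models

def fake_embedding(model_id: int, length: int, has_keyword: bool) -> np.ndarray:
    """Synthetic embedding whose statistics weakly leak the hidden attributes."""
    e = rng.normal(loc=float(model_id), scale=1.0, size=DIM)
    e[:8] += 0.05 * length               # text length leaks into a few coordinates
    e[8:16] += 2.0 * float(has_keyword)  # keyword occurrence leaks as well
    return e

# Attacker-side "shadow" corpus: embeddings labelled with the attributes to infer.
X, y_model, y_len, y_kw = [], [], [], []
for _ in range(3000):
    m = int(rng.integers(len(MODELS)))       # which GPLM produced the embedding
    n = int(rng.integers(5, 60))             # token count of the hidden text
    kw = bool(rng.integers(2))               # does the text contain the target word?
    X.append(fake_embedding(m, n, kw))
    y_model.append(m)
    y_len.append(n // 10)                    # coarse length buckets
    y_kw.append(int(kw))
X = np.array(X)

# One lightweight classifier per step of the inference chain.
for name, y in [("source model", y_model), ("length bucket", y_len), ("keyword", y_kw)]:
    Xtr, Xte, ytr, yte = train_test_split(X, np.array(y), random_state=0)
    clf = LogisticRegression(max_iter=2000).fit(Xtr, ytr)
    print(f"{name} inference accuracy: {clf.score(Xte, yte):.2f}")
```

    On this toy data the three classifiers recover the hidden attributes almost perfectly; the paper's finding is that comparable leakage is measurable from the embeddings of real, unmodified GPLMs (near 100% for the source model and over 70% for text length, per the abstract).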

Publication history
  • Date of publication: 2021-04-30
