    Anti-Mapping Representation Perturbation for RAG Sensitive Information Protection

    Abstract: Retrieval-augmented generation (RAG) systems extend the capabilities of language models by incorporating an external database. This augmentation introduces a new privacy vulnerability: mapping attacks (MA), which reveal whether a private fragment is indexed and how it is retrieved. No existing defense is designed specifically to counter such attacks. We introduce AMRP-SIP, a dual-randomization framework that protects both document embeddings and retrieval traces while preserving state-of-the-art utility. AMRP-SIP comprises three lightweight stages. First, a Random Orthogonal Projection compresses each query and document into a low-dimensional latent space, hiding the raw embeddings and reducing downstream noise. Second, Adaptive Differential Privacy injects cluster-adaptive Gaussian noise, ensuring (ε, δ) privacy at the fragment level. Third, a score-dropout layer perturbs similarity scores with noise and randomly drops a portion of the retrieved documents with probability p, obfuscating the retrieval trajectory. Experiments on Wiki-40B, PubMed, and IP-Database show that AMRP-SIP reduces the AUC of membership inference attacks (MIA) from 0.75 to 0.27. The framework protects sensitive information while maintaining retrieval performance comparable to existing techniques, providing the first defense against mapping attacks for RAG systems.
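    The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the norm clipping, the dot-product similarity, and the independent per-document drop are assumptions made for illustration. In particular, the abstract does not specify the cluster-adaptive noise rule, so the sketch substitutes the classic Gaussian-mechanism calibration σ = Δ·sqrt(2·ln(1.25/δ))/ε (valid for ε < 1) with a single fixed σ.

```python
import math
import random


def random_orthogonal_projection(d_in, d_out, rng):
    """Stage 1 (assumed form): d_out orthonormal rows in R^d_in,
    built by Gram-Schmidt on random Gaussian vectors."""
    rows = []
    while len(rows) < d_out:
        v = [rng.gauss(0.0, 1.0) for _ in range(d_in)]
        for r in rows:  # remove components along rows already chosen
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip (vanishingly rare) degenerate draws
            rows.append([a / norm for a in v])
    return rows


def project(P, x):
    """Map an embedding x into the low-dimensional latent space."""
    return [sum(p * a for p, a in zip(row, x)) for row in P]


def clip_norm(z, c):
    """Bound the L2 norm by c so the noise sensitivity is bounded (assumption)."""
    norm = math.sqrt(sum(a * a for a in z))
    return z if norm <= c else [a * c / norm for a in z]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classic Gaussian-mechanism noise scale; stands in for the paper's
    cluster-adaptive rule, which the abstract does not describe."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def add_noise(z, sigma, rng):
    """Stage 2 (simplified): add i.i.d. Gaussian noise to each coordinate."""
    return [a + rng.gauss(0.0, sigma) for a in z]


def score_dropout(scores, score_sigma, p, k, rng):
    """Stage 3 (assumed form): perturb scores, take the top k,
    then drop each retrieved document independently with probability p."""
    noisy = [(s + rng.gauss(0.0, score_sigma), i) for i, s in enumerate(scores)]
    noisy.sort(reverse=True)
    top = [i for _, i in noisy[:k]]
    return [i for i in top if rng.random() >= p]


if __name__ == "__main__":
    rng = random.Random(0)
    P = random_orthogonal_projection(d_in=16, d_out=4, rng=rng)
    sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=2.0)  # 2 * clip norm
    docs = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(10)]
    private_docs = [add_noise(clip_norm(project(P, d), 1.0), sigma, rng) for d in docs]
    query = clip_norm(project(P, docs[3]), 1.0)
    scores = [sum(a * b for a, b in zip(query, d)) for d in private_docs]
    print(score_dropout(scores, score_sigma=0.1, p=0.2, k=5, rng=rng))
```

    With noise at this scale the retrieved set is dominated by randomness. The sketch shows only how the stages compose, not the privacy-utility trade-off the abstract reports.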

       
