Abstract:
With the rapid advance of digitization, data elements have become a core driving force of modern society. At the same time, data security problems have grown increasingly prominent: data breaches and privacy violations occur frequently, causing serious losses to individuals, organizations, and even nations. Against this backdrop, the security of data elements has drawn attention from all sectors of society, and data privacy protection in deep learning models has likewise attracted widespread interest. Machine unlearning, a key technology for protecting users’ privacy, aims to enable a model to remove the influence of specific data while maintaining generalization performance on the remaining data, providing an effective way to protect the security of data elements in deep learning models. Existing machine unlearning methods fall into two categories: exact unlearning and approximate unlearning. Exact unlearning methods, however, must intervene in the model’s original training process, while approximate unlearning methods struggle to balance unlearning performance against model generalization. To address these issues, we propose an approximate unlearning framework based on feature constraints and adaptive loss balancing, organized as a “forgetting–recovering” process. First, in the “forgetting” stage, to mimic the feature outputs a retrained model would produce for the forgotten samples, we use a randomly initialized model that has never been trained on those samples to guide the feature outputs of the unlearning model; constraining forgetting at the feature level prevents information about the forgotten data from being easily extracted from the model. Then, a small amount of data is used for fine-tuning to “recover” the model’s generalization performance on the remaining data.
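The feature-level “forgetting” constraint described above can be sketched as follows. This is a minimal illustration, not the paper’s implementation: the abstract does not fix the feature-matching distance, so mean squared error is assumed here, and all function names are illustrative.

```python
def mse(a, b):
    # Mean squared error between two feature vectors of equal length.
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def forgetting_loss(unlearn_feats, random_init_feats):
    # Average feature-level distance between the unlearning model's
    # outputs on the forget samples and the outputs of a randomly
    # initialized model that never saw those samples. Minimizing this
    # pulls the forget-set features toward "untrained" behavior, so the
    # forgotten data's information is not easily extracted from features.
    per_sample = [mse(f, g) for f, g in zip(unlearn_feats, random_init_feats)]
    return sum(per_sample) / len(per_sample)
```

In practice the feature vectors would come from the penultimate layer of the two networks; during “recovering,” a standard task loss on a small retained subset would be minimized alongside this term.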
Furthermore, we cast this machine unlearning framework as a multi-task optimization problem and introduce adaptive loss balancing to automatically weigh the “forgetting” and “recovering” tasks against each other, preventing the model from “over-forgetting” or “over-recovering” and allowing the two tasks to be trained in a relatively balanced and stable manner. Extensive experiments on three image classification datasets show that our method effectively forgets the targeted data and achieves the best performance on multiple metrics.
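One common way to realize such adaptive balancing between two task losses is uncertainty-style weighting with a learnable log-variance per task (in the spirit of Kendall et al.); the sketch below assumes that scheme purely for illustration, as the abstract does not specify the balancing mechanism.

```python
import math

def balanced_loss(l_forget, l_recover, s_forget, s_recover):
    # Combine the "forgetting" and "recovering" task losses.
    # s_forget / s_recover are learnable per-task log-variances:
    # exp(-s) down-weights a task whose loss is noisy or dominant,
    # while the additive s term keeps the weights from collapsing
    # to zero, so neither task can be "over-trained" at the other's
    # expense.
    return (math.exp(-s_forget) * l_forget + s_forget
            + math.exp(-s_recover) * l_recover + s_recover)
```

During training, the optimizer would update `s_forget` and `s_recover` jointly with the model parameters, so the forgetting/recovering trade-off adapts automatically instead of relying on a hand-tuned fixed weight.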