    Elastic Orthogonal Weight Modification for Continual Learning in the Context of Synthetic Speech Detection


Abstract: Deep learning has achieved significant success in synthetic speech detection. However, deep models typically attain high accuracy on test sets whose distribution closely matches the training set, while their accuracy drops substantially in cross-dataset scenarios. To improve a model's generalization to a new dataset, it is usually fine-tuned on the new data, but this causes catastrophic forgetting: training on new data impairs the knowledge learned from old data, degrading performance on the old data. Continual learning is one of the main approaches to overcoming catastrophic forgetting. This paper proposes a continual learning method for synthetic speech detection, Elastic Orthogonal Weight Modification (EOWM), to overcome catastrophic forgetting. EOWM reduces the damage to previously learned knowledge by correcting both the direction and the magnitude of parameter updates while the model learns new knowledge. Specifically, it requires the update direction to be orthogonal to the data distribution of the old task and, at the same time, constrains the update magnitude of parameters that are important to the old task. In cross-dataset experiments on synthetic speech detection, the method achieves promising results. Compared with fine-tuning, EOWM reduces the Equal Error Rate (EER) on the old dataset from 7.334% to 0.821%, a relative reduction of 90%, and on the new dataset from 0.513% to 0.315%, a relative reduction of 40%.
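The abstract describes two complementary constraints: the update direction is projected orthogonally to the old task's input distribution (as in orthogonal weight modification, OWM), and updates to parameters important for the old task are elastically penalized (as in elastic weight consolidation, EWC). The PyTorch sketch below illustrates how such a combined update could look for a single linear layer; the class and function names, hyperparameter values, and the exact way the two terms are combined are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: names, hyperparameters, and the way the two
# constraints are combined are assumptions, not the paper's actual code.
import torch
import torch.nn.functional as F

class OWMProjector:
    """Maintains a projector P onto the subspace orthogonal to inputs
    seen on old tasks, via the recursive update
    P <- P - (P x x^T P) / (alpha + x^T P x)."""
    def __init__(self, in_dim: int, alpha: float = 1e-3):
        self.P = torch.eye(in_dim)
        self.alpha = alpha

    def observe(self, x: torch.Tensor) -> None:
        # x: mean input of a batch from the old task, shape (in_dim, 1)
        Px = self.P @ x
        self.P -= (Px @ Px.T) / (self.alpha + (x.T @ Px).item())

    def project(self, grad: torch.Tensor) -> torch.Tensor:
        # grad: weight gradient, shape (out_dim, in_dim); projection
        # removes components that would disturb old-task responses
        return grad @ self.P

def eowm_step(W, x, y, projector, fisher_diag, theta_star, lam=1.0, lr=0.1):
    """One EOWM-style update on a linear classifier W: the orthogonal
    projection controls the update *direction* (OWM part), while the
    elastic term limits the update *magnitude* on parameters that were
    important for the old task (EWC part)."""
    loss = F.cross_entropy(x @ W.T, y)
    grad = torch.autograd.grad(loss, W)[0]
    with torch.no_grad():
        g_orth = projector.project(grad)                   # direction constraint
        g_elastic = lam * fisher_diag * (W - theta_star)   # magnitude constraint
        W -= lr * (g_orth + g_elastic)

# Toy usage with random stand-in data.
in_dim, out_dim = 16, 2
W = torch.randn(out_dim, in_dim, requires_grad=True)
theta_star = W.detach().clone()            # parameters after the old task
fisher_diag = torch.rand(out_dim, in_dim)  # stand-in importance estimates

projector = OWMProjector(in_dim)
x_old = torch.randn(32, in_dim)            # old-task features
projector.observe(x_old.mean(dim=0, keepdim=True).T)

x_new = torch.randn(8, in_dim)             # new-task features
y_new = torch.randint(0, out_dim, (8,))
eowm_step(W, x_new, y_new, projector, fisher_diag, theta_star)
```

In a deep network, one would presumably maintain a projector and an importance estimate per layer and apply the combined correction to every weight matrix; the single-layer case above is only meant to show the interplay between the direction and magnitude constraints.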

       
