Denoising Autoencoder-Based Language Feature Compensation
Abstract: In language identification (LID), accuracy drops sharply when the durations of the training and test utterances are mismatched. This paper proposes a method based on a denoising autoencoder (DAE) that compensates the language features of test utterances of different durations. The DAE maps features from variable-length utterances to a fixed-length representation, which mitigates both the length mismatch and the unbalanced phoneme distribution. The method has four steps: 1) the speech signal is framed and transformed into low-level acoustic features; 2) the original i-vector of the utterance is extracted and its phonetic vector is computed; 3) the original i-vector and the phonetic vector are concatenated and fed into the DAE-based language feature compensation unit, which outputs a compensated i-vector; 4) the compensated i-vector and the original i-vector are each passed to the back-end classifier to obtain two score vectors, which are fused at the score level before the final decision. Experiments on NIST-LRE07 show that the proposed feature compensation method improves identification performance at every test duration. Compared with traditional LID systems, the relative improvement is 3.16% for 30 s test utterances and 2.90% for 10 s test utterances. Compared with the end-to-end LID system, the relative improvement is 3.21% for 3 s test utterances.
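The data flow of steps 3 and 4 can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the vector dimensions, the single tanh hidden layer, the randomly initialised (untrained) weights, the cosine-scoring back end, and the equal fusion weight. The function names `dae_compensate`, `cosine_scores`, and `fuse` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not state the actual dimensions.
IVEC_DIM, PHONE_DIM, HIDDEN_DIM, N_LANGS = 400, 40, 256, 14


def dae_compensate(ivector, phonetic, params):
    """Forward pass of a one-hidden-layer autoencoder (assumed architecture).

    Input: original i-vector concatenated with the phonetic vector.
    Output: a compensated vector with the same size as the i-vector.
    """
    x = np.concatenate([ivector, phonetic])
    hidden = np.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ hidden + params["b2"]


def cosine_scores(ivector, language_means):
    """Stand-in back end: cosine similarity to one mean vector per language."""
    v = ivector / np.linalg.norm(ivector)
    m = language_means / np.linalg.norm(language_means, axis=1, keepdims=True)
    return m @ v


def fuse(scores_a, scores_b, weight=0.5):
    """Score-level fusion as a weighted sum (the weight is an assumption)."""
    return weight * scores_a + (1.0 - weight) * scores_b


# Random weights stand in for a trained DAE; real weights would come from
# training the network to map short-utterance inputs to long-utterance targets.
params = {
    "W1": rng.standard_normal((HIDDEN_DIM, IVEC_DIM + PHONE_DIM)) * 0.01,
    "b1": np.zeros(HIDDEN_DIM),
    "W2": rng.standard_normal((IVEC_DIM, HIDDEN_DIM)) * 0.01,
    "b2": np.zeros(IVEC_DIM),
}
language_means = rng.standard_normal((N_LANGS, IVEC_DIM))

# Synthetic inputs standing in for one test utterance.
ivector = rng.standard_normal(IVEC_DIM)
phonetic = rng.random(PHONE_DIM)
phonetic /= phonetic.sum()  # treat the phonetic vector as a distribution

compensated = dae_compensate(ivector, phonetic, params)
fused = fuse(
    cosine_scores(compensated, language_means),
    cosine_scores(ivector, language_means),
)
predicted = int(np.argmax(fused))
```

Because the weights are untrained, `predicted` carries no meaning; the sketch only shows how the concatenated input, the compensated i-vector, and the two score vectors fit together.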