ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (12): 2649-2659. doi: 10.7544/issn1000-1239.2017.20170637

Special Topic: 2017 Artificial Intelligence Applications

• Artificial Intelligence •

Complementary Learning: A Training Method for Deep Neural Networks for Image Applications with Noisy Labels

Zhou Yucong, Liu Yi, Wang Rui

  1. (Sino-German Joint Software Institute, Beihang University, Beijing 100191) (zjoe546@foxmail.com)
  • Published: 2017-12-01
  • Supported by:
    National Key Research and Development Program of China (2016YFB0200100); National Natural Science Foundation of China (61732002)

Training Deep Neural Networks for Image Applications with Noisy Labels by Complementary Learning

Zhou Yucong, Liu Yi, Wang Rui   

  1. (Sino-German Joint Software Institute, Beihang University, Beijing 100191)
  • Online: 2017-12-01

Abstract: In recent years, deep neural networks have made breakthrough progress in many fields such as image recognition, speech recognition, and natural language processing. The rapid development of the Internet and mobile devices has greatly promoted the popularity of image applications and accumulated large amounts of data for training deep neural networks. Among these, large-scale manually annotated data is the key to training deep neural networks successfully. However, as the scale of data grows rapidly, the cost of manually annotating images keeps rising, and annotation errors inevitably occur, which harms the training of neural networks. To address this, we propose a method called complementary learning for training deep neural networks in image applications. Combining the ideas of easy example mining and transfer learning, it uses a small amount of manually annotated clean data and a large amount of data with noisy labels to jointly train two deep neural network models, one main and one auxiliary. During training, complementary strategies are used to select a subset of samples for each model to learn from, while the knowledge of the auxiliary model is transferred to the main model, thereby reducing the impact of noisy labels on training. Experiments show that the proposed method can effectively train deep neural networks on data with noisy labels, offers advantages over other methods, and has strong practical value.

Keywords: deep neural networks, image applications, noisy labels, easy example mining, transfer learning

Abstract: In recent years, deep neural networks (DNNs) have made great progress in many fields such as image recognition, speech recognition, and natural language processing. The rapid development of the Internet and mobile devices has promoted the popularity of image applications and provided large amounts of data for training DNNs, among which large-scale manually annotated data is the key to training DNNs successfully. However, as the scale of data grows rapidly, the cost of manual annotation keeps rising and its quality is hard to guarantee, which damages the performance of DNNs. Combining the ideas of easy example mining and transfer learning, we propose a method called complementary learning to train DNNs on large-scale noisy labels for image applications. With a small number of clean labels and a large number of noisy labels, we jointly train two DNNs, one main and one auxiliary, with complementary sample-selection strategies, and meanwhile transfer knowledge from the auxiliary model to the main model. Experiments show that this method can efficiently train DNNs on noisy labels. Compared with existing approaches, it can handle more complicated noisy labels, which demonstrates its value for image applications.
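The abstract describes the approach only at a high level; the paper's exact losses, selection schedules, and architectures are not reproduced here. As a minimal illustrative sketch (the function names, keep ratio, and temperature below are our own assumptions, not the authors'), the two per-batch ingredients mentioned above, easy example mining and knowledge transfer from the auxiliary model to the main model, might look like:

```python
import numpy as np

def softmax(z):
    """Row-wise softmax over a batch of logits."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def select_easy(per_sample_loss, keep_ratio=0.5):
    """Easy example mining: keep the indices of the lowest-loss samples
    in the batch; the remaining samples are treated as likely label noise."""
    k = max(1, int(len(per_sample_loss) * keep_ratio))
    return np.argsort(per_sample_loss)[:k]

def transfer_loss(main_logits, aux_logits, temperature=2.0):
    """Knowledge transfer: cross-entropy between the auxiliary model's
    softened predictions (used as targets) and the main model's softened
    predictions, as in standard knowledge distillation."""
    p_aux = softmax(aux_logits / temperature)
    p_main = softmax(main_logits / temperature)
    return float(-np.mean(np.sum(p_aux * np.log(p_main + 1e-12), axis=1)))
```

In a full training loop, each model would compute per-sample losses on the noisy batch, keep its own (complementary) subset of easy examples for the gradient step, and the main model's objective would additionally include `transfer_loss` against the auxiliary model's outputs, so that knowledge learned by the auxiliary model flows into the main model.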

Key words: deep neural networks (DNNs), image applications, noisy labels, easy example mining, transfer learning
