Citation: Zhang Zhenyu, Jiang Yuan. Label Noise Robust Learning Algorithm in Environments with Evolving Features[J]. Journal of Computer Research and Development, 2023, 60(8): 1740-1753. DOI: 10.7544/issn1000-1239.202330238
In real-world applications, data are often collected as a stream whose features can evolve over time. For instance, in an environmental monitoring task, features may vanish or emerge dynamically as old sensors expire and new sensors are deployed. Moreover, beyond the evolvable feature space, the labels may contain noise. When the feature space evolves and the data carry inaccurate labels at the same time, it is quite challenging to design algorithms with guarantees, particularly a theoretical understanding of generalization ability. To address this difficulty, we propose a new discrepancy measure for noisily labeled data with an evolving feature space, named the label noise robust evolving discrepancy. Using this measure, we present a generalization error analysis, and the theory motivates the design of a learning algorithm, which is further implemented with deep neural networks. Empirical studies on synthetic data confirm the rationale of our discrepancy measure, and extensive experiments on real-world tasks validate the effectiveness of our algorithm.
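The label noise robust evolving discrepancy itself is defined in the full text; as a rough, simplified illustration of how a discrepancy between two feature distributions can be estimated from samples (in the spirit of the kernel two-sample test of reference [31], not the paper's actual measure), a plain maximum mean discrepancy (MMD) estimate can be sketched as follows. The function names and the RBF bandwidth `gamma` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2).
    sq_dists = ((X**2).sum(1)[:, None]
                + (Y**2).sum(1)[None, :]
                - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    # Biased (V-statistic) estimate of the squared MMD between the
    # sample sets X and Y; larger values indicate a larger discrepancy
    # between the two underlying distributions.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())
```

Under this sketch, samples drawn from the same distribution yield an estimate near zero, while a distribution shift (e.g., after the feature space changes) yields a clearly larger value.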
[1] Zhou Zhihua. Open-environment machine learning[J]. National Science Review, 2022, 9(8): nwac123 doi: 10.1093/nsr/nwac123
[2] Zhou Zhihua. A brief introduction to weakly supervised learning[J]. National Science Review, 2018, 5(1): 44−53 doi: 10.1093/nsr/nwx106
[3] Hou Bojian, Zhang Lijun, Zhou Zhihua. Learning with feature evolvable streams[C] //Advances in Neural Information Processing Systems 30. Cambridge, MA: MIT, 2017: 1416−1426
[4] Zhang Zhenyu, Zhao Peng, Jiang Yuan, et al. Learning with feature and distribution evolvable streams[C] //Proc of the 37th Int Conf on Machine Learning. New York: ACM, 2020: 11317−11327
[5] Cesa-Bianchi N, Dichterman E, Fischer P, et al. Sample-efficient strategies for learning in the presence of noise[J]. Journal of the ACM, 1999, 46(5): 684−719 doi: 10.1145/324133.324221
[6] Natarajan N, Dhillon I S, Ravikumar P K, et al. Learning with noisy labels[C] //Advances in Neural Information Processing Systems 26. Cambridge, MA: MIT, 2013: 1196−1204
[7] Song H, Kim M, Lee J G. SELFIE: Refurbishing unclean samples for robust deep learning[C] //Proc of the 36th Int Conf on Machine Learning. New York: ACM, 2019: 5907−5915
[8] Ben-David S, Blitzer J, Crammer K, et al. Analysis of representations for domain adaptation[C] //Advances in Neural Information Processing Systems 19. Cambridge, MA: MIT, 2006: 137−144
[9] Mansour Y, Mohri M, Rostamizadeh A. Domain adaptation: Learning bounds and algorithms[C] //Proc of the 22nd Conf on Learning Theory. New York: ACM, 2009: 18−29
[10] Cortes C, Mohri M, Medina A M. Adaptation based on generalized discrepancy[J]. Journal of Machine Learning Research, 2019, 20(1): 1−30
[11] Dietterich T G. Steps toward robust artificial intelligence[J]. AI Magazine, 2017, 38(3): 3−24
[12] Zhou Zhihua. Learnware: On the future of machine learning[J]. Frontiers of Computer Science, 2016, 10(4): 589−590
[13] Guan S U, Li Shanchun. Incremental learning with respect to new incoming input attributes[J]. Neural Processing Letters, 2001, 14: 241−260 doi: 10.1023/A:1012799113953
[14] Zhang Qin, Zhang Peng, Long Guodong, et al. Online learning from trapezoidal data streams[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(10): 2709−2723 doi: 10.1109/TKDE.2016.2563424
[15] 刘艳芳,李文斌,高阳. 基于被动-主动的特征演化流学习[J]. 计算机研究与发展,2021,58(8):1575−1585
Liu Yanfang, Li Wenbin, Gao Yang. Passive-aggressive learning with feature evolvable streams[J]. Journal of Computer Research and Development, 2021, 58(8): 1575−1585 (in Chinese)
[16] Hou Chenping, Zhou Zhihua. One-pass learning with incremental and decremental features[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(11): 2776−2792 doi: 10.1109/TPAMI.2017.2769047
[17] 刘兆清,古仕林,侯臣平. 面向特征继承性增减的在线分类算法[J]. 计算机研究与发展,2022,59(8):1668−1682
Liu Zhaoqing, Gu Shilin, Hou Chenping. Online classification algorithm with feature inheritably increasing and decreasing[J]. Journal of Computer Research and Development, 2022, 59(8): 1668−1682 (in Chinese)
[18] He Yi, Wu Baijun, Wu Di, et al. Online learning from capricious data streams: A generative approach[C] //Proc of the 28th Int Joint Conf on Artificial Intelligence. Macao, SAR China: Morgan Kaufmann, 2019: 2491−2497
[19] Beyazit E, Alagurajah J, Wu Xindong. Online learning from data streams with varying feature spaces[C] //Proc of the 33rd AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2019: 3232−3239
[20] Dong Jiahua, Cong Yang, Sun Gan, et al. Evolving metric learning for incremental and decremental features[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 32(4): 2290−2302
[21] Hou Bojian, Zhang Lijun, Zhou Zhihua. Prediction with unpredictable feature evolution[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(10): 5706−5715
[22] Angluin D, Laird P. Learning from noisy examples[J]. Machine Learning, 1988, 2: 343−370
[23] Aslam J A, Decatur S E. On the sample complexity of noise-tolerant learning[J]. Information Processing Letters, 1996, 57(4): 189−195 doi: 10.1016/0020-0190(96)00006-3
[24] Gao Wei, Wang Lu, Zhou Zhihua. Risk minimization in the presence of label noise[C] //Proc of the 30th AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2016: 1575−1581
[25] Arora S, Ge Rong, Moitra A. Learning topic models − going beyond SVD[C] //Proc of the 53rd IEEE Annual Symp on Foundations of Computer Science. Piscataway, NJ: IEEE, 2012: 1−10
[26] Liu Tongliang, Tao Dacheng. Classification with noisy labels by importance reweighting[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(3): 447−461 doi: 10.1109/TPAMI.2015.2456899
[27] Zhang Zhenyu, Zhao Peng, Jiang Yuan, et al. Learning from incomplete and inaccurate supervision[J]. IEEE Transactions on Knowledge and Data Engineering, 2022, 34(12): 5854−5868 doi: 10.1109/TKDE.2021.3061215
[28] Scott C, Blanchard G, Handy G. Classification with asymmetric label noise: Consistency and maximal denoising[C] //Proc of the 26th Conf on Learning Theory. Berlin: Springer, 2013: 489−511
[29] Ramaswamy H, Scott C, Tewari A. Mixture proportion estimation via kernel embeddings of distributions[C] //Proc of the 33rd Int Conf on Machine Learning. New York: ACM, 2016: 2052−2060
[30] Sugiyama M, Nakajima S, Kashima H, et al. Direct importance estimation with model selection and its application to covariate shift adaptation[C] //Advances in Neural Information Processing Systems. Cambridge, MA: MIT, 2007: 1433−1440
[31] Gretton A, Borgwardt K M, Rasch M J, et al. A kernel two-sample test[J]. Journal of Machine Learning Research, 2012, 13(1): 723−773
[32] Mohri M, Muñoz-Medina A. New analysis and algorithm for learning with drifting distributions[C] //Proc of the 23rd Int Conf on Algorithmic Learning Theory. Berlin: Springer, 2012: 124−138
[33] Menon A K, Rawat A S, Reddi S J, et al. Can gradient clipping mitigate label noise?[C/OL] //Proc of the 8th Int Conf on Learning Representations. 2020. https://openreview.net/forum?id=rklB76EKPr
[34] Han Bo, Yao Quanming, Yu Xingrui, et al. Co-teaching: Robust training of deep neural networks with extremely noisy labels[C] //Advances in Neural Information Processing Systems 31. Cambridge, MA: MIT, 2018: 8536−8546
[35] Kanamori T, Suzuki T, Sugiyama M. Theoretical analysis of density ratio estimation[J]. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2010, 93(4): 787−798
[36] Huang Jiayuan, Gretton A, Borgwardt K, et al. Correcting sample selection bias by unlabeled data[C] //Advances in Neural Information Processing Systems. Cambridge, MA: MIT, 2006: 601−608
[37] Kanamori T, Hido S, Sugiyama M. A least-squares approach to direct importance estimation[J]. Journal of Machine Learning Research, 2009, 10: 1391−1445
[38] Ganin Y, Ustinova E, Ajakan H, et al. Domain-adversarial training of neural networks[J]. Journal of Machine Learning Research, 2016, 17(1): 2096−2030
[39] Zhang Yuchen, Liu Tianle, Long Mingsheng, et al. Bridging theory and algorithm for domain adaptation[C] //Proc of the 36th Int Conf on Machine Learning. New York: ACM, 2019: 7404−7413
[40] McAuley J, Targett C, Shi Qinfeng, et al. Image-based recommendations on styles and substitutes[C] //Proc of the 38th Int ACM SIGIR Conf on Research and Development in Information Retrieval. New York: ACM, 2015: 43−52
[41] Amini M R, Usunier N, Goutte C. Learning from multiple partially observed views − an application to multilingual text categorization[C] //Advances in Neural Information Processing Systems. Cambridge, MA: MIT, 2009: 28−36