Jiang Gaoxia, Wang Wenjian. A Numerical Label Noise Filtering Algorithm for Regression Task[J]. Journal of Computer Research and Development, 2022, 59(8): 1639-1652. DOI: 10.7544/issn1000-1239.20220053

A Numerical Label Noise Filtering Algorithm for Regression Task

Funds: This work was supported by the National Natural Science Foundation of China (U21A20513, 62076154, 61906113, U1805263), the Key Research and Development Program of Shanxi Province International Cooperation (201903D421050), and the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2020L0007).
  • Published Date: July 31, 2022
  • Numerical label noise in regression can misguide model training and weaken generalization ability. Noise filtering, a popular technique, reduces the noise level by removing mislabeled samples, but it rarely guarantees better generalization performance. Some filters focus so heavily on the noise level that they also remove many noise-free samples. The existing sample selection framework can balance the number of removals against the noise level, but it is too complicated to understand intuitively or to apply in practice. This paper proposes a generalization error bound for data with numerical label noise, based on learning theory for the noise-free regression task. The bound clarifies the key data factors, including data size and noise level, that affect generalization ability. On this basis, an interpretable noise filtering framework is proposed whose goal is to minimize the noise level at a low cost in removed samples. For noise estimation, the relationship between noise and the key indicators (center and radius) of the covering interval is analyzed theoretically, and a relative noise estimator is proposed. The relative noise filtering (RNF) algorithm is designed by integrating the proposed framework with the estimator. The effectiveness of RNF is verified on benchmark datasets and an age estimation dataset. Experimental results show that RNF adapts to various types of noise and significantly improves the generalization ability of the regression model. On the age estimation dataset, RNF detects samples with label noise, effectively improving data quality and model prediction performance.
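
The general idea of filtering regression samples by an estimated relative noise can be illustrated with a minimal sketch. This is not the RNF algorithm: the abstract does not give the covering-interval estimator, so the sketch swaps in a simple stand-in, a leave-one-out k-nearest-neighbor median prediction whose absolute residual is divided by the median residual. The function names, `k`, and `threshold` are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_loo_median_predict(X, y, k=5):
    """Predict each label as the median label of its k nearest neighbors,
    excluding the sample itself (leave-one-out)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)            # a sample is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return np.median(y[neighbors], axis=1)


def relative_noise_filter(X, y, k=5, threshold=3.0):
    """Return (keep_mask, relative_noise).

    Stand-in estimator (an assumption, not the paper's): the absolute
    leave-one-out residual divided by the median absolute residual.
    Samples whose relative noise exceeds `threshold` are dropped.
    """
    residual = np.abs(y - knn_loo_median_predict(X, y, k))
    scale = np.median(residual) + 1e-12       # guard against division by zero
    relative_noise = residual / scale
    return relative_noise <= threshold, relative_noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
    y = 2.0 * X[:, 0] + rng.normal(0.0, 0.01, size=200)
    y[10::20] += 5.0                          # inject gross label noise
    keep, rel = relative_noise_filter(X, y)
    print("samples removed:", int((~keep).sum()))
```

Raising `threshold` removes fewer samples but tolerates more label noise, and lowering it does the opposite. That is the trade-off between noise level and removal cost that the abstract describes, shown here only in toy form.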