Fu Junjie, Liu Gongshen. A GEV-Based Classification Algorithm for Imbalanced Data[J]. Journal of Computer Research and Development, 2018, 55(11): 2361-2371. DOI: 10.7544/issn1000-1239.2018.20170514

A GEV-Based Classification Algorithm for Imbalanced Data

  • Published Date: October 31, 2018
  • Binary classification with imbalanced data arises in many fields and remains an open problem. Beyond predicting the class label directly, many applications also need the probability that an instance belongs to a given class. However, most existing research focuses on classification performance and neglects probability estimation. This paper aims to improve class probability estimation (CPE) while preserving classification performance. A new regression approach is proposed that adopts the generalized linear model as its basic framework and uses the calibration loss function as the optimization objective. Because the generalized extreme value (GEV) distribution is asymmetric and flexible, we use it to formulate the link function, which benefits binary classification with imbalanced data. For model estimation, since the shape parameter has a significant influence on modeling precision, two methods for estimating the shape parameter of the GEV distribution are proposed. Experiments on synthetic datasets demonstrate the accuracy of the shape parameter estimation. Experimental results on real data further suggest that the proposed approach, compared with three other commonly used regression algorithms, performs well on both classification and CPE. In addition, the proposed algorithm outperforms other optimization algorithms in computational efficiency.
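To make the role of the GEV link function described in the abstract more concrete, the following is a minimal Python sketch of a GEV-based inverse link next to the symmetric logistic link. The abstract does not give the paper's exact formulation, so the parameterization here, P(y = 1 | x) = 1 − G(−η; ξ) with G the standard GEV distribution function, η the linear predictor, and ξ the shape parameter, is an assumption chosen for illustration. The function names `gev_cdf`, `gev_inverse_link`, and `logistic` are likewise hypothetical. The sketch covers neither the calibration loss nor the two shape parameter estimation methods.

```python
import math


def gev_cdf(x: float, xi: float) -> float:
    """CDF of the standard GEV distribution (location 0, scale 1), shape xi."""
    if abs(xi) < 1e-12:
        # xi -> 0 is the Gumbel limit: G(x) = exp(-exp(-x))
        z = -x
    else:
        t = 1.0 + xi * x
        if t <= 0.0:
            # Outside the support: below the lower bound when xi > 0,
            # above the upper bound when xi < 0.
            return 0.0 if xi > 0 else 1.0
        # t ** (-1 / xi) written as exp(z) for numerical safety
        z = -math.log(t) / xi
    if z > 700.0:
        # exp(z) would overflow; the CDF is effectively 0 here
        return 0.0
    return math.exp(-math.exp(z))


def gev_inverse_link(eta: float, xi: float) -> float:
    """Assumed form of the GEV link: P(y = 1 | x) = 1 - G(-eta; xi)."""
    return 1.0 - gev_cdf(-eta, xi)


def logistic(eta: float) -> float:
    """Symmetric logistic inverse link, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Print both response curves; the shape values are arbitrary examples.
    for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        row = [f"{gev_inverse_link(eta, xi):.4f}" for xi in (-0.5, 0.0, 0.5)]
        print(f"eta={eta:+.1f}  logistic={logistic(eta):.4f}  gev={row}")
```

Under this assumed form, the logistic curve satisfies p(η) + p(−η) = 1, whereas the GEV curve does not, and its skewness changes with ξ. That asymmetry is the property the abstract cites as the reason for using the GEV distribution with imbalanced data.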
