Credit Fraud Detection for Extremely Imbalanced Data Based on Ensembled Deep Learning

Liu Ying; Yang Ke

doi:10.7544/issn1000-1239.2021.20200324

Liu Ying, Yang Ke. Credit Fraud Detection for Extremely Imbalanced Data Based on Ensembled Deep Learning[J]. Journal of Computer Research and Development, 2021, 58(3): 539-547. DOI: 10.7544/issn1000-1239.2021.20200324

Abstract

Class imbalance in credit fraud data significantly undermines model performance. In particular, when the sample distribution is extremely imbalanced, noise caused by information distortion, statistical discrepancy and reporting bias severely damages model training and can lead to problems such as overfitting. This paper therefore proposes an algorithm based on an ensemble deep belief network to handle credit fraud data characterized by extreme imbalance. First, a joint sampling strategy that combines under-sampling and over-sampling is used to draw the training subsets. Then, classifier clusters are constructed in two stages: support vector classifiers and random forest classifiers are combined with a Boosting algorithm to overcome the decision boundary deviation of the support vector machine. Finally, a deep belief network aggregates the classifiers' predictions and outputs the final classification result. In addition, traditional evaluation methods put too much emphasis on the majority samples and ignore the fact that in practice the minority class matters more. A revenue cost index that accounts for the identification of both positive and negative samples is therefore introduced. An empirical study on European credit card data shows that the proposed algorithm scores 3% higher on the revenue cost index than the average of the other algorithms. The experiment also evaluates how the imbalance ratio affects performance and finds that the proposed algorithm outperforms the other algorithms in this respect.
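The joint sampling step can be illustrated with a minimal sketch. The abstract does not give the paper's actual procedure, so everything here is an assumption: the function name `joint_sample`, the `target_per_class` parameter, and the use of plain random under-sampling and random duplication are hypothetical stand-ins for whatever the authors use.

```python
import random


def joint_sample(majority, minority, target_per_class, seed=0):
    """Draw one balanced training subset (illustrative only).

    majority, minority: lists of samples of each class.
    target_per_class: hypothetical parameter, samples to keep per class.
    Returns a shuffled list of (sample, label) pairs, label 1 = minority.
    """
    rng = random.Random(seed)

    # Under-sampling: random subset of the majority class, no replacement.
    n_major = min(target_per_class, len(majority))
    major_part = rng.sample(majority, n_major)

    # Over-sampling: keep every minority sample, then pad with random
    # duplicates until the target size is reached.
    if len(minority) >= target_per_class:
        minor_part = rng.sample(minority, target_per_class)
    else:
        extra = target_per_class - len(minority)
        minor_part = list(minority) + [rng.choice(minority) for _ in range(extra)]

    subset = [(x, 0) for x in major_part] + [(x, 1) for x in minor_part]
    rng.shuffle(subset)
    return subset


if __name__ == "__main__":
    legit = [("legit", i) for i in range(1000)]
    fraud = [("fraud", i) for i in range(5)]
    subset = joint_sample(legit, fraud, target_per_class=50, seed=1)
    print(len(subset), sum(label for _, label in subset))
```

Calling the function with different seeds would yield different subsets, one per base classifier in an ensemble; whether the paper builds its subsets this way is not stated in the abstract.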
