ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2021, Vol. 58 ›› Issue (11): 2485-2499. doi: 10.7544/issn1000-1239.2021.20200523


Space Transformation Based Random Forest Algorithm

Guan Xiaoqiang1, Wang Wenjian1,2, Pang Jifang1, Meng Yinfeng3   

  1 (School of Computer and Information Technology, Shanxi University, Taiyuan 030006); 2 (Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006); 3 (School of Mathematical Sciences, Shanxi University, Taiyuan 030006)
  • Online:2021-11-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61876103, 61673249, U1805263, 62006148), the Key Research and Development Program of Shanxi Province (201903D421050), and the 1331 Engineering Project of Shanxi Province.

Abstract: Random forest is a widely used classification algorithm in machine learning, valued for its broad applicability and resistance to overfitting. To improve the overall performance of random forest on multi-class classification problems, a space transformation based random forest algorithm (ST-RF) is proposed. First, a priority class based linear discriminant analysis (PCLDA) method is designed: a projection matrix is obtained for the priority class, and the resulting space transformation enhances the separation between priority-class samples and samples of the other classes. Then, PCLDA is introduced into the construction of the random forest. Selecting the priority class at random for each decision tree ensures diversity among the trees in the forest, while building each tree with PCLDA under its own priority class improves the classification accuracy of the individual trees. Together, these effects improve the overall classification performance of the ensemble. The effectiveness of ST-RF is verified by comparing it with seven typical random forest algorithms on 10 standard datasets. In addition, the PCLDA-based space transformation strategy is applied to these comparison algorithms, and their performance before and after adding the strategy is compared and analyzed. The experimental results show that ST-RF has clear advantages on multi-class classification problems, and that the proposed space transformation strategy is broadly applicable and can significantly improve the classification performance of the original algorithms.
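
The construction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact form of PCLDA, so this is not the authors' method: as a stand-in, the sketch uses a two-class Fisher discriminant direction (priority class versus all other classes) and appends the projected value to the original features as the "space transformation". The function names (`priority_direction`, `transform`, `fit_st_forest`, `predict_st_forest`), the feature-augmentation choice, the regularization constant, and the use of scikit-learn's `DecisionTreeClassifier` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def scatter(A):
    """Scatter matrix of the rows of A around their mean."""
    centered = A - A.mean(axis=0)
    return centered.T @ centered


def priority_direction(X, y, priority, reg=1e-6):
    """Unit Fisher direction separating `priority` from all other classes.

    Assumed stand-in for PCLDA; the paper's projection matrix may differ.
    """
    is_priority = (y == priority)
    Xp, Xo = X[is_priority], X[~is_priority]
    mean_diff = Xp.mean(axis=0) - Xo.mean(axis=0)
    # Within-group scatter, regularized so the linear system is solvable.
    Sw = scatter(Xp) + scatter(Xo) + reg * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mean_diff)
    return w / np.linalg.norm(w)


def transform(X, w):
    """Append the discriminant projection as one extra feature (assumption)."""
    return np.hstack([X, (X @ w)[:, None]])


def fit_st_forest(X, y, n_trees=25, seed=0):
    """Train trees, each in a space transformed for a randomly chosen priority class."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample
        Xb, yb = X[idx], y[idx]
        priority = rng.choice(np.unique(yb))        # random priority class per tree
        w = priority_direction(Xb, yb, priority)
        tree = DecisionTreeClassifier(
            max_features="sqrt", random_state=int(rng.integers(1 << 31))
        )
        tree.fit(transform(Xb, w), yb)
        forest.append((w, tree))
    return forest


def predict_st_forest(forest, X, classes):
    """Majority vote over trees, each applied in its own transformed space."""
    votes = np.zeros((len(X), len(classes)), dtype=int)
    for w, tree in forest:
        pred = tree.predict(transform(X, w))
        votes += (pred[:, None] == classes[None, :])
    return classes[votes.argmax(axis=1)]


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    forest = fit_st_forest(X, y)
    pred = predict_st_forest(forest, X, np.unique(y))
    print("training accuracy:", (pred == y).mean())
```

In this sketch each tree stores its own direction `w`, so a test sample is transformed separately for every tree before voting. The random choice of priority class supplies diversity across trees, and the added discriminant feature is intended to make that tree's priority class easier to split off.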

Key words: random forest, priority class, linear discriminant analysis, space transformation, decision tree
