ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (12): 2611-2619. doi: 10.7544/issn1000-1239.2018.20180575

Special Issue: 2018 Special Topic on Fragmented Knowledge Fusion and Applications


Clustering Ensemble Algorithm with Cluster Connection Based on Wisdom of Crowds

Zhang Hengshan1,2, Gao Yukun1, Chen Yanping1,2, Wang Zhongmin1,2   

  1 (School of Computer Science & Technology, Xi’an University of Posts and Telecommunications, Xi’an 710121); 2 (Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing (Xi’an University of Posts and Telecommunications), Xi’an 710121)
  • Online: 2018-12-01

Abstract: Aggregating many independent clustering results for the same data set according to the wisdom-of-crowds principle markedly improves the accuracy and stability of clustering. This paper proposes a clustering ensemble algorithm with cluster connection based on wisdom of crowds (CECWOC). First, independent clustering results are produced by different clustering algorithms, guided by the independence, decentralization, and diversity criteria of the wisdom of crowds. Second, a clustering ensemble algorithm based on connected triples aggregates the resulting independent clusters group by group, and the group-level results are then aggregated again to produce the final cluster set. The proposed algorithm has two advantages: 1) the clusters produced by the base clusterings are aggregated in groups and the cluster weights are adjusted, so no cluster selection is needed and no information in the produced clusters is discarded; 2) similarities between data points are computed with the connected-triple algorithm, so relations between data points whose direct similarity is zero can still be exploited. Experimental results on different data sets show that the proposed algorithm obtains more accurate and stable results than other clustering ensemble algorithms, including those based on the wisdom-of-crowds framework.
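The second advantage, using relations between data points whose direct similarity is zero, can be illustrated with a minimal sketch. It is not the CECWOC implementation, which the abstract does not specify. It assumes a common link-based formulation: clusters are graph vertices, edge weights are Jaccard overlaps between clusters, and two clusters gain similarity through each shared neighbor (a connected triple). The function names, the Jaccard weighting, and the `decay` parameter are all assumptions made for illustration.

```python
from itertools import combinations


def cluster_sets(labelings):
    """Collect every cluster of every base clustering as a set of point indices."""
    clusters = {}
    for m, labels in enumerate(labelings):
        for idx, lab in enumerate(labels):
            clusters.setdefault((m, lab), set()).add(idx)
    return clusters


def connected_triple_cluster_sim(clusters, decay=0.8):
    """Similarity between clusters via shared neighbors (connected triples).

    Assumption: edge weights are Jaccard overlaps, and `decay` (hypothetical)
    caps the similarity of two distinct clusters below 1.
    """
    keys = list(clusters)
    weight = {}
    for a, b in combinations(keys, 2):
        shared = len(clusters[a] & clusters[b])
        if shared:
            jaccard = shared / len(clusters[a] | clusters[b])
            weight[(a, b)] = weight[(b, a)] = jaccard

    triple = {}
    for a, b in combinations(keys, 2):
        # Each common neighbor k contributes the weaker of its two edges.
        total = sum(
            min(weight.get((a, k), 0.0), weight.get((b, k), 0.0))
            for k in keys
            if k != a and k != b
        )
        triple[(a, b)] = triple[(b, a)] = total

    largest = max(triple.values(), default=0.0)
    if largest == 0.0:
        return {pair: 0.0 for pair in triple}
    return {pair: decay * value / largest for pair, value in triple.items()}


def refined_similarity(labelings, decay=0.8):
    """Average point-to-point similarity over all base clusterings.

    Points in the same cluster score 1; otherwise they score the
    connected-triple similarity of their two clusters.
    """
    n = len(labelings[0])
    csim = connected_triple_cluster_sim(cluster_sets(labelings), decay)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for m, labels in enumerate(labelings):
                if labels[i] == labels[j]:
                    total += 1.0
                else:
                    total += csim[((m, labels[i]), (m, labels[j]))]
            sim[i][j] = total / len(labelings)
    return sim


# Two toy base clusterings of four points; points 0 and 2 never share a cluster.
base_clusterings = [[0, 0, 1, 1], [0, 1, 1, 2]]
similarity = refined_similarity(base_clusterings)
print(similarity[0][2])
```

In this toy input, points 0 and 2 are never placed in the same cluster, so a plain co-association count would give them similarity zero. The connected-triple step assigns them a positive similarity because their clusters overlap a common third cluster. The resulting matrix could then be passed to any consensus step, such as hierarchical clustering.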

Key words: wisdom of crowds (WOC), clustering ensemble, connecting triple, clustering ensemble select (CES), data mining
