ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (5): 1029-1042. doi: 10.7544/issn1000-1239.2016.20148428


A Dynamic Data Stream Clustering Algorithm Based on Probability and Exemplar

Bi Anqi1, Dong Aimei1,2, Wang Shitong1   

  1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu 214122
  2. School of Information, Qilu University of Technology, Jinan 250353
  • Online: 2016-05-01

Abstract: We propose an efficient probability drifting dynamic α-expansion clustering algorithm for data stream clustering. We first develop a unified target function for both the affinity propagation (AP) and enhanced α-expansion move (EEM) clustering algorithms, which yields the probability exemplar-based clustering algorithm. Within this probabilistic framework, we then propose the probability drifting dynamic α-expansion (PDDE) clustering algorithm. The framework handles data stream clustering when the current data points are similar to the previous data points. During clustering, the proposed algorithm ensures that the clustering result for the current data points is at least comparable to that for the previous data points. Moreover, the algorithm handles two kinds of similarity between current and previous data points: the case where the current data points share some points with the previous data points, and the case where they do not. Experiments on both synthetic datasets (D31, Birch 3) and real-world datasets (Forest Covertype, KDD CUP99) demonstrate the capability of PDDE in clustering data streams and its advantage over both the AP and EEM algorithms.
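
The abstract does not state the unified target function itself. As a rough illustration of the kind of objective that exemplar-based methods such as AP minimize, the sketch below evaluates the standard (non-probabilistic) exemplar energy: the negative sum of each point's similarity to its chosen exemplar, with an infinite penalty when a chosen exemplar does not select itself. This is not the paper's PDDE objective; the function name `exemplar_energy`, the toy data, and the `preference` value are assumptions made only for this example.

```python
import math


def exemplar_energy(similarity, assignment):
    """Energy of an exemplar assignment (lower is better).

    similarity[i][k] is the similarity of point i to candidate exemplar k;
    similarity[k][k] acts as the preference of k for being an exemplar.
    assignment[i] is the index of the exemplar chosen for point i.
    """
    n = len(assignment)
    # Consistency constraint: a point used as an exemplar must choose itself.
    for i in range(n):
        k = assignment[i]
        if assignment[k] != k:
            return math.inf
    # Negative total similarity of every point to its exemplar.
    return -sum(similarity[i][assignment[i]] for i in range(n))


# Toy 1-D data with two well-separated groups (illustrative values only).
points = [0.0, 0.1, 0.2, 5.0, 5.1]
preference = -1.0  # hypothetical cost of opening an exemplar

# Negative squared distance off the diagonal, preference on the diagonal.
similarity = [
    [preference if i == k else -(p - q) ** 2 for k, q in enumerate(points)]
    for i, p in enumerate(points)
]

two_clusters = [1, 1, 1, 3, 3]  # exemplars are points 1 and 3
one_cluster = [1, 1, 1, 1, 1]   # a single exemplar, point 1
invalid = [1, 2, 2, 3, 3]       # point 1 is used as an exemplar but picks 2

for name, labels in [("two", two_clusters), ("one", one_cluster), ("invalid", invalid)]:
    print(name, exemplar_energy(similarity, labels))
```

On this toy data the two-exemplar assignment has lower energy than the single-exemplar one, and the inconsistent assignment is rejected. A dynamic method in the spirit described in the abstract would presumably reuse the exemplars found on earlier data when scoring assignments for newly arrived points, but how PDDE does so is not specified here.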

Key words: data stream, energy function, probability, optimization algorithm, dynamic clustering

CLC Number: