Abstract:
Outlier detection is an important problem in data mining. Finding outliers in data streams is difficult because streams are dynamic, can be read only once, and contain large volumes of data. This paper proposes DSOKP, a data stream outlier detection algorithm based on k-means partitioning. DSOKP applies k-means clustering to each partition of the data stream to generate a set of mean reference points, and then selects the potential outliers of each period according to the definition of outliers. Theoretical analysis and experimental results indicate that DSOKP is effective and efficient.
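
The partition-then-cluster idea described above can be illustrated with a minimal sketch. It is not the DSOKP algorithm itself: the abstract does not state the outlier definition, so the sketch assumes a simple stand-in rule (a point is a potential outlier if its distance to the nearest mean reference point exceeds a fixed threshold). The names `kmeans`, `partition_outliers`, `partition_size`, and `threshold` are hypothetical, and the deterministic centroid initialization is a simplification chosen for reproducibility.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centroids.

    ASSUMPTION: centroids are seeded with k evenly spaced points of the
    partition (requires k <= len(points)); the paper may initialize differently.
    """
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep the old one if empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def partition_outliers(stream, partition_size, k, threshold):
    """Yield (partition_index, point) for potential outliers, reading the stream once.

    ASSUMPTION: a point is a potential outlier when its distance to the nearest
    mean reference point (centroid) of its own partition exceeds `threshold`.
    A trailing partition shorter than `partition_size` is ignored in this sketch.
    """
    buffer = []
    index = 0
    for point in stream:
        buffer.append(point)
        if len(buffer) == partition_size:
            refs = kmeans(buffer, k)  # mean reference point set for this partition
            for p in buffer:
                if min(math.dist(p, r) for r in refs) > threshold:
                    yield index, p
            buffer = []  # only one partition is held in memory at a time
            index += 1
```

Because each partition is clustered and then discarded, memory use is bounded by the partition size rather than by the length of the stream, which is the property that makes this style of processing suitable for one-pass data.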