ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (5): 1071-1079. doi: 10.7544/issn1000-1239.2015.20140275


Concept Drifting Detection for Categorical Evolving Data Based on Parallel Reducts

Deng Dayong (1), Xu Xiaoyu (1), Huang Houkuan (2)

  1. College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004
  2. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
  • Online: 2015-05-01

Abstract: Data stream mining is an active topic in data mining, and concept drift detection is one of its research directions. Many methods for detecting concept drift exist, but they have drawbacks: for example, they do not remove redundant attributes across the sliding windows as a whole, and they detect drift from outer properties of the data. Based on the basic principles of rough sets and F-rough sets, this paper treats the sliding windows of a data stream as decision subsystems and uses the attribute significance of conditional attributes to detect concept drift. The method has two steps: first, redundant attributes in the streaming data are removed by means of parallel reducts; then, concept drift is detected from changes in attribute significance. Unlike existing methods, it relies on inner properties of the data stream. Experiments show that the method effectively removes redundant attributes across windows as a whole and detects concept drift, and that attribute significance based on mutual information is more effective for drift detection than attribute significance based on the positive region. For data stream mining, the paper provides a new method for detecting concept drift; for rough set theory, it offers a new application area.
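The second step described in the abstract, comparing attribute significance between sliding windows, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it skips the parallel-reduct step entirely, uses the standard rough-set definitions of significance (the drop in positive-region dependency, or the gain in conditional entropy, when one attribute is removed), and the function names, the `drift_score` measure, and the toy windows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def dependency(rows, attrs, decision):
    """gamma_B(D): share of rows lying in the positive region of D w.r.t. attrs."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[a] for a in attrs)].append(r[decision])
    # An equivalence class is in the positive region if its decision is unique.
    pos = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return pos / len(rows)


def conditional_entropy(rows, attrs, decision):
    """H(D | attrs), estimated from frequencies inside one window."""
    classes = defaultdict(Counter)
    for r in rows:
        classes[tuple(r[a] for a in attrs)][r[decision]] += 1
    n = len(rows)
    h = 0.0
    for counts in classes.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (c / n) * log2(c / size)
    return h


def significance(rows, attrs, decision, measure):
    """Significance of each attribute: effect of removing it from attrs."""
    result = {}
    for a in attrs:
        rest = [b for b in attrs if b != a]
        if measure == "positive_region":
            result[a] = (dependency(rows, attrs, decision)
                         - dependency(rows, rest, decision))
        else:  # "mutual_information": I(attrs; D) - I(rest; D)
            result[a] = (conditional_entropy(rows, rest, decision)
                         - conditional_entropy(rows, attrs, decision))
    return result


def drift_score(win_a, win_b, attrs, decision, measure):
    """Hypothetical drift measure: largest change in any attribute's significance."""
    sa = significance(win_a, attrs, decision, measure)
    sb = significance(win_b, attrs, decision, measure)
    return max(abs(sa[a] - sb[a]) for a in attrs)


# Toy windows (assumed data): the decision d follows x in window 1, y in window 2.
window1 = [{"x": x, "y": y, "d": x} for x in (0, 1) for y in (0, 1)]
window2 = [{"x": x, "y": y, "d": y} for x in (0, 1) for y in (0, 1)]
```

In this toy pair the decision attribute switches from depending on `x` to depending on `y`, so the significance of both attributes changes between windows and `drift_score` is large under either measure; comparing a window with itself gives zero. Any threshold used to turn the score into a drift decision would be a further assumption.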

Key words: data streams, concept drift, rough sets, F-rough sets, parallel reducts
