Abstract:
Data stream mining is an active area of data mining, and concept drift detection is one of its research directions. Many methods for detecting concept drift exist, but they have drawbacks: for example, they do not remove redundant attributes across the sliding windows as a whole, and they detect drift from the outer properties of the data. Based on the basic principles of rough sets and F-rough sets, this paper regards the sliding windows of a data stream as decision subsystems and uses the attribute significance of conditional attributes to detect concept drift. The method has two steps: first, redundant attributes in the data stream are removed by means of parallel reducts; then, concept drift is detected from changes in attribute significance. Unlike existing methods, it uses the inner properties of the data stream to detect concept drift. Experiments show that the method is effective at removing redundant attributes as a whole and at detecting concept drift, and that attribute significance based on mutual information is more effective for drift detection than attribute significance based on the positive region. For data stream mining, this paper provides a new method for detecting concept drift; for rough set theory, it offers a new application area.
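To make the second step concrete, the following is a minimal Python sketch of detecting drift from a change in attribute significance between two sliding windows. It is an illustration under stated assumptions, not the paper's implementation: it uses one common rough-set definition of mutual-information significance (the drop in I(C; D) when an attribute is removed from the condition set C), it skips the parallel-reduct step entirely, and the function names and the `threshold` value are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows, cols):
    """Shannon entropy of the partition that the given columns induce on rows."""
    n = len(rows)
    counts = Counter(tuple(r[c] for c in cols) for r in rows)
    return -sum((k / n) * log2(k / n) for k in counts.values())


def mutual_information(rows, cond, dec):
    """I(cond; dec) = H(cond) + H(dec) - H(cond and dec jointly)."""
    return entropy(rows, cond) + entropy(rows, dec) - entropy(rows, cond + dec)


def significance(rows, cond, dec, a):
    """Assumed definition: loss of I(cond; dec) when attribute a is dropped."""
    rest = [c for c in cond if c != a]
    return mutual_information(rows, cond, dec) - mutual_information(rows, rest, dec)


def drift_detected(win_a, win_b, cond, dec, threshold=0.1):
    """Flag drift if any attribute's significance changes by more than threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    return any(
        abs(significance(win_a, cond, dec, a) - significance(win_b, cond, dec, a))
        > threshold
        for a in cond
    )


# Each row is (a0, a1, d): two conditional attributes and one decision attribute.
window_1 = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]  # decision follows a0
window_2 = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]  # decision follows a1

print(drift_detected(window_1, window_2, cond=[0, 1], dec=[2]))
print(drift_detected(window_1, window_1, cond=[0, 1], dec=[2]))
```

In this toy stream the decision depends on the first attribute in one window and on the second attribute in the next, so the significance of both attributes changes even though the class distribution stays balanced. That is the sense in which the check relies on inner rather than outer properties of the windows.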