Abstract:
Streaming data is now used in a wide range of application fields. Because it arrives quickly, in large volumes, and continuously, online learning must process it accurately in a single pass. Concept drift often occurs as streaming data is continuously generated. Research on concept drift detection is relatively mature. In practice, however, factors in the learning environment evolve in different directions, so concept drift in streaming data appears in diverse classes, which poses new challenges for streaming data mining and online learning. To address this problem, this paper proposes a concept drift class detection method based on time windows (CD-TW). The method uses a stack and a queue to access the data and a window mechanism to learn the streaming data in chunks. It detects the concept drift site by creating two basic site time windows, which load historical data and current data respectively, and comparing the distributions of the data they contain. It then creates a span time window that loads part of the data after the drift site. The drift span is obtained by analyzing the distribution stability of the data in the span time window and is then used to judge the concept drift class. Experimental results demonstrate that CD-TW not only detects the concept drift site accurately but also performs well in judging the concept drift class.
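
The windowing idea summarized above can be illustrated with a minimal Python sketch. It is not the CD-TW implementation: the abstract does not specify the distribution test, the stability criterion, or the class names, so the standardized mean difference in `distribution_gap`, the `threshold` and `abrupt_limit` parameters, the "abrupt"/"gradual" labels, and all function names are assumptions chosen only for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def distribution_gap(a, b):
    """Stand-in distribution test (assumption): mean difference of two
    chunks, scaled by the spread of their union."""
    spread = pstdev(list(a) + list(b)) or 1.0  # avoid dividing by zero
    return abs(mean(a) - mean(b)) / spread


def split_chunks(stream, chunk_size):
    """Cut the stream into consecutive, non-overlapping full chunks."""
    last_start = len(stream) - chunk_size
    return [stream[i:i + chunk_size] for i in range(0, last_start + 1, chunk_size)]


def detect_site(chunks, threshold):
    """Compare a historical window with the current window and return the
    index of the first chunk whose distribution differs, or None."""
    history = deque(maxlen=1)  # historical window: the latest past chunk
    for idx, current in enumerate(chunks):
        if history and distribution_gap(history[0], current) > threshold:
            return idx
        history.append(current)
    return None


def measure_span(chunks, site, threshold):
    """Load the chunks after the drift site into a span window (a queue)
    and count chunks until two consecutive chunks look stable."""
    span_window = deque(chunks[site:])
    previous = span_window.popleft()
    span = 1
    while span_window:
        current = span_window.popleft()
        if distribution_gap(previous, current) <= threshold:
            return span
        span += 1
        previous = current
    return span  # the stream ended before the distribution stabilized


def analyze(stream, chunk_size=4, threshold=1.0, abrupt_limit=1):
    """Return (site position in the stream, drift span in chunks, label),
    or None when no drift is found. The labels are hypothetical."""
    chunks = split_chunks(stream, chunk_size)
    site = detect_site(chunks, threshold)
    if site is None:
        return None
    span = measure_span(chunks, site, threshold)
    label = "abrupt" if span <= abrupt_limit else "gradual"
    return site * chunk_size, span, label
```

With these defaults, `analyze([0] * 8 + [5] * 12)` returns `(8, 1, "abrupt")`, while a stream that climbs in steps, `[0] * 8 + [1] * 4 + [2] * 4 + [3] * 4 + [4] * 8`, returns `(8, 4, "gradual")`. A real detector would replace `distribution_gap` with a proper statistical test and tune the thresholds on data.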