ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development

Concept Drift Class Detection Based on Time Window

Guo Husheng 1,2, Ren Qiaoyan 1, Wang Wenjian 1,2   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006

  2. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006

  • Online:2021-02-05
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61503229, 61673249, U1805263, 62076154), the Natural Science Foundation of Shanxi Province (201901D111033), and the Key Research and Development Program of Shanxi Province (International Cooperation, 201903D421050).

Abstract: Streaming data, a relatively new type of data, is now used in many application fields. Because it arrives quickly, in large volume and continuously, online learning must process it accurately in a single pass. Concept drift often occurs as streaming data is continuously generated, and research on concept drift detection is relatively mature. In practice, however, the factors of the learning environment evolve in different directions, so concept drift in streaming data falls into diverse classes, which poses new challenges for streaming data mining and online learning. To address this problem, this paper proposes a concept drift class detection method based on time windows (CD-TW). The method uses a stack and a queue to access the data and a window mechanism to learn the streaming data in chunks. It detects the concept drift site by creating two basic site time windows, which load historical data and current data respectively, and comparing the distributions of the data they contain. It then creates a span time window that loads part of the data after the drift site. The drift span is obtained by analyzing the stability of the data distribution in the span time window and is then used to determine the concept drift class. Experimental results demonstrate that CD-TW not only detects the concept drift site accurately but also performs well in determining the concept drift class.
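
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CD-TW implementation, and every concrete choice in it is an assumption made for illustration: the function names (`detect_drift_site`, `measure_drift_span`, `classify_drift`), the use of a difference in window means as the distribution-change test, the use of a small standard deviation as the stability test, the window sizes and thresholds, and the class labels "abrupt" and "gradual", which the abstract itself does not name.

```python
from collections import deque
from statistics import mean, pstdev


def detect_drift_site(stream, window=20, threshold=1.0):
    """Return the index at which drift is first flagged, or None.

    Assumption: distributions are compared by the difference of window means.
    """
    hist = deque(maxlen=window)  # basic site window holding historical data
    cur = deque(maxlen=window)   # basic site window holding current data
    for i, x in enumerate(stream):
        if len(cur) == window:
            hist.append(cur[0])  # oldest current item becomes history
        cur.append(x)            # a full deque drops its oldest item here
        if len(hist) == window and abs(mean(cur) - mean(hist)) > threshold:
            return i
    return None


def measure_drift_span(stream, site, span_window=10, tol=0.1):
    """Return how many items after the site pass before the data is stable.

    Assumption: a span window is "stable" when its standard deviation < tol.
    Returns None if no stable window is found.
    """
    tail = stream[site:]
    for offset in range(len(tail) - span_window + 1):
        if pstdev(tail[offset:offset + span_window]) < tol:
            return offset
    return None


def classify_drift(span, abrupt_limit=5):
    """Map a drift span to a hypothetical class label."""
    if span is None:
        return "unknown"
    return "abrupt" if span <= abrupt_limit else "gradual"


# Synthetic streams: a sudden jump and a 40-step linear ramp from 0 to 5.
abrupt_stream = [0.0] * 60 + [5.0] * 60
gradual_stream = [0.0] * 60 + [0.125 * (k + 1) for k in range(40)] + [5.0] * 60

for name, data in (("abrupt", abrupt_stream), ("gradual", gradual_stream)):
    site = detect_drift_site(data)
    span = measure_drift_span(data, site)
    print(name, site, span, classify_drift(span))
```

In this sketch a short span means the data settles into a new stable distribution almost immediately after the detected site, while a long span means the distribution keeps changing for some time. A real detector would replace the mean and standard deviation checks with a proper statistical test on the window contents.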

Key words: streaming data; concept drift; time window; drift span; concept drift class