Abstract:
With the widespread application of data stream mining, the classification of concept-drifting data streams has become increasingly important and challenging. Because the underlying concept in such streams changes over time, an effective learner must be able to track these changes and adapt to them quickly. To handle the problem of concept drift, a method named dynamic hierarchical ECOC algorithm based on incremental KnnModel (IKnnM-DHecoc) is proposed. It divides a given data stream into several data blocks and learns from each block using the incremental KnnModel algorithm. Based on the outcomes of this pre-learning, a hierarchical tree and a hierarchical coding matrix are built and updated, and a chosen incremental learning method is then trained on them to build a set of classifiers and a set of candidate classifiers. Moreover, a pruning strategy for the generated nodes of the hierarchical tree is proposed, which reduces computational cost by taking each node's activity into account. In the testing phase, a combination scheme that takes advantage of both IKnnModel and DHecoc is used for prediction. Experimental results show that the proposed IKnnM-DHecoc algorithm not only improves the dynamic nature of learning and the classification performance, but also adapts quickly to concept drift.