Abstract:
Mining generalized association rules is an important research area in data mining. In real-life applications the transaction database is updated frequently, which makes the maintenance of generalized association rules a challenging problem. In this paper, we first analyze and summarize all the update cases of the taxonomy data and derive several special properties of updating. Second, we project the transaction database onto a new compact structure called GECT (generalized extended canonical-order tree), which has two header tables that can be used to mine the whole updated tree and the incremental tree. Third, we propose an incremental updating algorithm, GECT-IM, which finds most of the updated frequent itemsets by scanning the set of updating transactions instead of the original database. To overcome the limitation of GECT-IM, which still needs to scan the GECT when infrequent itemsets become frequent, we propose a further optimized structure called PGECT (pre-large generalized extended canonical-order tree) and an efficient algorithm, PGECT-IM. Within a certain updating scope, PGECT-IM can find all the updated frequent itemsets without rescanning the original PGECT. Experiments on synthetic datasets show that both GECT-IM and PGECT-IM are not only correct and complete but also outperform well-known existing algorithms.
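To make the setting concrete, the following minimal Python sketch illustrates generalized support counting under a taxonomy and an incremental update that scans only the new transactions. It is not the GECT or PGECT structure and not the GECT-IM or PGECT-IM algorithm: it keeps a flat table of itemset counts in memory, which is a simplification. The taxonomy, item names, and the functions `ancestors`, `extend`, `count_itemsets`, and `frequent` are hypothetical and introduced only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy (child -> parent), assumed for illustration only.
TAXONOMY = {
    "jacket": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirt": "clothes",
}


def ancestors(item, taxonomy):
    """Return every ancestor of `item` in the taxonomy."""
    found = set()
    while item in taxonomy:
        item = taxonomy[item]
        found.add(item)
    return found


def extend(transaction, taxonomy):
    """Add the ancestors of each purchased item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item, taxonomy)
    return extended


def count_itemsets(transactions, taxonomy, max_size=2):
    """Count generalized itemsets of up to `max_size` items.

    Itemsets that contain an item together with one of its own
    ancestors are skipped, since they carry no extra information.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(extend(transaction, taxonomy))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                redundant = any(
                    a in ancestors(b, taxonomy)
                    for a in itemset
                    for b in itemset
                )
                if not redundant:
                    counts[itemset] += 1
    return counts


def frequent(counts, n_transactions, min_support):
    """Return the itemsets whose support reaches `min_support`."""
    threshold = min_support * n_transactions
    return {s for s, c in counts.items() if c >= threshold}


# Original database and a later batch of updating transactions.
original = [{"jacket", "shirt"}, {"ski pants"}, {"shirt"}, {"jacket"}]
increment = [{"ski pants", "shirt"}, {"jacket"}]

counts = count_itemsets(original, TAXONOMY)
before = frequent(counts, len(original), 0.3)

# Incremental step: only the new transactions are scanned; their
# counts are added to the counts kept from the original database.
counts.update(count_itemsets(increment, TAXONOMY))
after = frequent(counts, len(original) + len(increment), 0.3)
```

In this toy run the generalized pair `("outerwear", "shirt")` is infrequent in the original database and becomes frequent after the update, even though no pair of leaf items does. Because the sketch stores the count of every itemset, it never has to rescan the original data; a compact tree structure would be needed to make the same idea practical at scale.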