Research on a New Concise Representation of Frequent Itemset and Its Mining Algorithm
-
Graphical Abstract
-
Abstract
Frequent itemset mining has become an important data mining task and a focused theme in data mining research. The bottlenecks of frequent itemset mining are as follows: On the one hand, the number of all frequent itemsets is usually extremely large. On the other hand, there is often severe redundancy in the resultant itemsets. To overcome these problems, recently several proposals have been made to construct a concise representation of the frequent itemsets, instead of mining all frequent itemsets. The aim is that the resultant subset can either satisfy the requirements of applications, or can derive all the other frequent itemsets. Maximal itemset and closed itemset are two most typical representative subsets of all frequent itemsets. Based on maximal itemset and closed itemset, a new concise representation of frequent itemset, namely meta itemset, is proposed. It is proved that both maximal itemset and closed itemset are special cases of meta itemset. The cardinality of meta itemset is between those of maximal itemset and closed itemset. Then, property of meta itemset is discussed. Finally, by introducing pruning strategies to DCI-closed-index, which is a closed itemset mining algorithm, an algorithm for mining meta itemset is proposed. Experimental results show that the proposed algorithm is effective and efficient.
-
-