Abstract:
Mining frequent itemsets is a fundamental problem in data mining applications. Most proposed mining algorithms are variants of Apriori. These algorithms perform well on sparse datasets. However, on dense datasets such as telecommunications and medical image data, which contain many long frequent itemsets, their performance degrades sharply. To solve this problem, an efficient algorithm, MFCIA, and its updating algorithm, UMFCIA, are proposed for mining frequent closed itemsets. The set of frequent closed itemsets uniquely determines the exact frequency of every frequent itemset, yet it can be orders of magnitude smaller than the set of all frequent itemsets, which lowers the computational cost of mining. UMFCIA reuses the previous mining results to reduce the cost of finding new frequent closed itemsets. Experiments show that MFCIA is efficient.
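
To illustrate why the frequent closed itemsets are a lossless summary, the following is a minimal brute-force sketch in Python. It is not MFCIA or UMFCIA, whose details are not given in this abstract. The function names, the toy transactions, and the support threshold are all assumptions chosen for illustration. An itemset is treated as closed when no proper superset has the same support, and the support of any frequent itemset is recovered as the largest support among its closed supersets.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (illustrative only, exponential in the item count).

    Returns a dict mapping each frequent itemset (a frozenset) to its support count.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other and support == other_support
                   for other, other_support in frequent.items())
    }


def support_from_closed(itemset, closed):
    """Recover the support of a frequent itemset from the closed itemsets alone."""
    return max(support for c, support in closed.items() if itemset <= c)


# Hypothetical toy database: four transactions over the items a, b, c.
transactions = [frozenset(t) for t in ("abc", "abc", "ab", "c")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy database there are seven frequent itemsets but only three closed ones ({c}, {a, b}, and {a, b, c}), and `support_from_closed` reproduces the support of every frequent itemset from those three. On dense data with long itemsets the gap between the two sets can be far larger.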