Abstract:
The Apriori algorithm has become a classic method for mining association rules. Its difficulty and computational cost arise from two aspects: (1) how to generate candidate frequent itemsets and calculate their support, and (2) how to reduce the number of candidate frequent itemsets and the number of I/O accesses. At present, many methods solve the second problem very well, but very few have been presented to solve the first. An efficient and fast algorithm based on a binary format is proposed for discovering candidate frequent itemsets and calculating their support; it executes only logical operations. A performance comparison of this algorithm with Apriori-like algorithms is given, and the experiments show that the new algorithm is more efficient.
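
To illustrate the idea of finding candidates and counting support with logical operations only, here is a minimal Python sketch. The abstract does not specify the binary encoding, so the sketch makes an assumption: each transaction is stored as an integer bitmask with one bit per item. The function names `encode`, `support`, and `frequent_itemsets` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def encode(transactions, items):
    """Encode each transaction as an integer bitmask, one bit per item (assumed layout)."""
    index = {item: i for i, item in enumerate(items)}
    masks = []
    for transaction in transactions:
        mask = 0
        for item in transaction:
            mask |= 1 << index[item]
        masks.append(mask)
    return masks


def support(candidate, masks):
    """Count transactions containing the candidate: AND must leave the candidate unchanged."""
    return sum(1 for mask in masks if (mask & candidate) == candidate)


def frequent_itemsets(masks, n_items, min_support):
    """Level-wise search: join k-itemsets with OR, count support with AND."""
    level = {1 << i for i in range(n_items) if support(1 << i, masks) >= min_support}
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset, masks)
        # The OR of two frequent k-itemsets is a candidate only if it has k + 1 bits set.
        candidates = {
            a | b
            for a, b in combinations(level, 2)
            if bin(a | b).count("1") == k + 1
        }
        level = {c for c in candidates if support(c, masks) >= min_support}
        k += 1
    return result
```

In this sketch, candidate generation is a bitwise OR and the containment test is a bitwise AND followed by a comparison, so no per-item set operations are needed. The sketch omits the usual Apriori subset-pruning step and rescans all transaction masks for every candidate; it is meant only to show the encoding, not to reproduce the performance reported in the paper.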