Abstract:
FP-growth is a high performance algorithm for mining frequent patterns at present, but it can't acquire high efficiency when it is applied to maximal frequent patterns (MFPs) mining. The cause of low efficiency is analyzed and according to the analysis an algorithm, SFP-Max, is presented. The main idea of this algorithm is that, (1) It is a sorted FP-tree based algorithm for mining MFPs. (2) The properties of MFPs are applied to reduce the size of MFI candidates. (3) A temporary set is added to reduce the size of initial test itemsets, so that the time consuming for candidates test can be reduced. In the performance study, SFP-Max is compared with MAFIA, one of the most efficient algorithms for MFPs' mining. The empirical results show that SFP-Max is an efficient algorithm, it has comparable performance with MAFIA, and in most cases it outperforms MAFIA.