高级检索

    基于FP树的全局最大频繁项集挖掘算法

    Algorithms of Mining Global Maximum Frequent Itemsets Based on FP-Tree

    • 摘要: 挖掘最大频繁项集是多种数据挖掘应用中的关键问题.在以往的最大频繁项集挖掘算法中,为了更新最大频繁候选项集集合,需要反复地扫描整个数据库,而且大部分算法是单机算法,全局最大频繁项集挖掘算法并不多见.为此提出MGMF算法,该算法利用FP-树结构,类似FP-树挖掘方法,一遍就可以挖掘出所有的最大频繁项集,并且超集检测非常简单、快捷.另外MGMF算法采用了分布式PDDM算法播报消息的思想,具有很好的拓展性和并行性.实验证明MGMF算法是有效可行的.

       

      Abstract: Mining maximum frequent itemsets is a key problem in data mining field with numerous important applications. The present algorithms need scanning the database many times for updating the set of maximum frequent itemsets and are based on local databases. The algorithms of mining global maximum frequent itemsets are very few. Therefore, an algorithm for mining global maximum frequent itemsets is proposed, which can conveniently get all global maximum frequent itemsets using FP-tree structure by one time mining, and superset checking is very simple and speedy. FP-tree structure has provided a kind of convenient depth-first mining method. The algorithm combines FP-tree with restrained sub-tree for mining global maximum frequent itemsets and adopts an efficient distributed PDDM algorithm for broadcasting itemsets information and improves the expansibility and the concurrence. The PDDM algorithm is based on previous DDM algorithm and improves I?O problem and communication of previous distributed algorithms. Experimental results testify the feasibility and effectiveness of the algorithm.

       

    /

    返回文章
    返回