
    Mining Frequent Patterns Based on the IS++-Tree Model

    • Abstract: The IS-tree is a recently proposed mathematical model that has been successfully applied to full-text indexing and storage in text databases. This paper extends its application to data mining and presents an algorithm for mining frequent patterns based on the extended IS++-tree model. Like FP-growth, the algorithm constructs frequent itemsets directly, without the costly candidate generation and testing that Apriori performs. It also has several advantages over the FP-tree model: it scans the transaction database only once; each mining task is associated locally with a single rooted tree; and the IS++-tree can be updated dynamically, requiring only incremental changes. Experiments show that the algorithm's efficiency is comparable to, or even higher than, that of FP-growth. More importantly, the IS++-tree also serves as a good index for the transaction database and supports transaction queries efficiently.

       
