
    Mining Frequent Closed Patterns from a Sliding Window over Data Streams

    • Abstract: The set of frequent closed patterns uniquely determines the complete set of frequent patterns and is usually much smaller. However, mining frequent closed patterns from a sliding window remains a major challenge. Based on the characteristics of data streams, a new algorithm called DS_CFI is proposed for discovering the frequent closed patterns in a sliding window. DS_CFI divides the sliding window into several basic windows, and the basic window serves as the unit of update. The potential frequent closed itemsets of each basic window are mined with an existing frequent closed pattern mining algorithm, and these itemsets and their subsets are stored in a new data structure called DSCFI_tree. The DSCFI_tree can be updated incrementally, and all frequent closed patterns in the sliding window can be found quickly from it. Experimental results show the feasibility and effectiveness of the algorithm.
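    The basic-window idea described in the abstract can be illustrated with a small sketch. This is not the DS_CFI algorithm and it does not implement the DSCFI_tree: the class `SlidingWindow`, the helper `count_itemsets`, and the parameter names are hypothetical, a plain dictionary of itemset counts stands in for the tree, and every subset of every transaction is enumerated by brute force, which is only practical for very short transactions. What it does show is the update scheme: the sliding window is a queue of basic windows, each new basic window is summarized once, and the summary of the expired basic window is subtracted rather than recomputed.

```python
from collections import Counter, deque
from itertools import combinations


def count_itemsets(transactions):
    """Count the support of every non-empty itemset in one basic window.

    Brute-force enumeration (an assumption of this sketch, not the
    mining step used by DS_CFI).
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


class SlidingWindow:
    """Sliding window made of a fixed number of basic windows (hypothetical name)."""

    def __init__(self, num_basic_windows):
        self.num_basic_windows = num_basic_windows
        self.windows = deque()   # (itemset counts, transaction count) per basic window
        self.total = Counter()   # aggregated itemset support over the sliding window
        self.size = 0            # number of transactions in the sliding window

    def push(self, transactions):
        """Add one basic window and expire the oldest one if the window is full."""
        counts = count_itemsets(transactions)
        self.windows.append((counts, len(transactions)))
        self.total.update(counts)
        self.size += len(transactions)
        if len(self.windows) > self.num_basic_windows:
            old_counts, old_size = self.windows.popleft()
            self.total.subtract(old_counts)
            self.size -= old_size
            # Drop itemsets that no longer occur in the sliding window.
            for itemset in old_counts:
                if self.total[itemset] <= 0:
                    del self.total[itemset]

    def frequent_closed(self, min_support):
        """Return {itemset: support} for the frequent closed itemsets.

        An itemset is closed if no proper superset has the same support.
        """
        threshold = min_support * self.size
        frequent = {s: c for s, c in self.total.items() if c >= threshold}
        return {
            s: c
            for s, c in frequent.items()
            if not any(s < other and c == oc for other, oc in frequent.items())
        }


window = SlidingWindow(num_basic_windows=2)
window.push([["a", "b", "c"], ["a", "b"]])
window.push([["a", "b"], ["c"]])
print(window.frequent_closed(min_support=0.5))
```

    With a minimum support of 0.5 over these four transactions, the itemsets {a} and {b} are frequent but not closed, because {a, b} has the same support; the sketch reports only {a, b} and {c}.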

       
