    Yang Bei, Huang Houkuan. Mining Top-K Significant Itemsets in Landmark Windows over Data Streams[J]. Journal of Computer Research and Development, 2010, 47(3): 463-473.

    Mining Top-K Significant Itemsets in Landmark Windows over Data Streams

    • Frequent itemset mining over data streams has recently become a hot topic in data mining and knowledge discovery, and it has been applied in many areas. However, setting a minimum support threshold requires domain knowledge, and an unreasonable threshold places a heavy burden on users. It is therefore of interest to let users find the top-K significant itemsets over data streams instead. A dynamic, incremental, approximate algorithm, TOPSIL-Miner, is presented to mine top-K significant itemsets in landmark windows. A new data structure, TOPSIL-Tree, is designed to store the potentially significant itemsets, and four further structures (the maximum support list, the ordered item list, TOPSET, and the minimum support list) are devised to maintain information about the mining results. Moreover, three optimization strategies are used to reduce the time and space cost of the algorithm: 1) pruning trivial nodes in the current data stream; 2) raising the mining support threshold heuristically and adaptively during the mining process; and 3) raising the pruning threshold dynamically. The accuracy of the algorithm is also analyzed. Extensive experiments are performed to evaluate the effectiveness, efficiency, and precision of the algorithm.
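
    To make the problem setting concrete, the sketch below shows a naive, exact baseline for top-K itemset mining in a landmark window. It is not TOPSIL-Miner and does not use TOPSIL-Tree or any of the pruning strategies described in the abstract. The class name `LandmarkTopK`, the `max_len` cap on itemset size, and the choice to rank itemsets by raw support count are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import heapq


class LandmarkTopK:
    """Naive exact top-K itemset counter over a landmark window.

    Hypothetical illustration only: every itemset of up to `max_len`
    items is counted from the landmark (the first transaction seen)
    up to the current transaction.
    """

    def __init__(self, k, max_len=3):
        self.k = k
        self.max_len = max_len      # assumed cap to keep enumeration small
        self.counts = Counter()     # itemset (sorted tuple) -> support count
        self.n_transactions = 0

    def add(self, transaction):
        """Process one transaction arriving on the stream."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for size in range(1, min(self.max_len, len(items)) + 1):
            for itemset in combinations(items, size):
                self.counts[itemset] += 1

    def top_k(self):
        """Return the K itemsets with the highest support so far.

        Ties are broken by itemset order so the result is deterministic.
        """
        return heapq.nsmallest(
            self.k, self.counts.items(), key=lambda kv: (-kv[1], kv[0])
        )

    def kth_support(self):
        """Support of the K-th itemset, or 0 if fewer than K are known.

        A threshold of this kind is what replaces a user-supplied
        minimum support in a top-K formulation.
        """
        top = self.top_k()
        return top[-1][1] if len(top) == self.k else 0


if __name__ == "__main__":
    miner = LandmarkTopK(k=3)
    for transaction in [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]:
        miner.add(transaction)
    print(miner.top_k())
    print(miner.kth_support())
```

    The user supplies only K. The effective support threshold is then the support of the K-th itemset, which can only stay the same or grow as the landmark window extends. Exhaustive counting like this does not scale to real streams, which is the cost that an approximate, pruned approach is meant to avoid.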