Abstract:
The set of frequent closed patterns uniquely determines the complete set of frequent patterns and is usually much smaller than that set. Mining frequent closed patterns over a sliding window, however, remains a significant challenge. Based on the characteristics of data streams, a new algorithm, called DS_CFI, is proposed for mining frequent closed itemsets. A sliding window is divided into several basic windows, and the basic window serves as the updating unit. The potential frequent closed itemsets of each basic window are mined with existing frequent closed pattern algorithms. These itemsets and their subsets are stored in a new data structure called DSCFI_tree. The DSCFI_tree can be updated incrementally, and the frequent closed itemsets in a sliding window can be found rapidly from it. Experimental results show the feasibility and effectiveness of the algorithm.
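The basic-window update scheme described above can be illustrated with a minimal Python sketch. It is not the DS_CFI algorithm or the DSCFI_tree itself: a flat `Counter` stands in for the tree, brute-force subset enumeration stands in for the per-window closed-pattern miner, and the names `itemset_counts`, `SlidingWindowMiner`, `add_basic_window`, and `frequent_closed`, as well as the absolute `min_support` threshold, are assumptions made for illustration only.

```python
from collections import Counter, deque
from itertools import combinations


def itemset_counts(transactions):
    """Count every non-empty itemset contained in each transaction.

    Brute force (exponential in transaction length); a stand-in for the
    per-basic-window mining step, suitable only for tiny examples.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


class SlidingWindowMiner:
    """Hypothetical sliding window made of a fixed number of basic windows."""

    def __init__(self, num_basic_windows, min_support):
        self.num_basic_windows = num_basic_windows
        self.min_support = min_support  # absolute count over the sliding window
        self.windows = deque()          # per-basic-window counts, oldest first
        self.total = Counter()          # aggregated counts (assumed stand-in for the tree)

    def add_basic_window(self, transactions):
        """Add the newest basic window and expire the oldest one if needed."""
        counts = itemset_counts(transactions)
        self.windows.append(counts)
        self.total.update(counts)
        if len(self.windows) > self.num_basic_windows:
            expired = self.windows.popleft()
            self.total.subtract(expired)
            for itemset in expired:
                if self.total[itemset] <= 0:
                    del self.total[itemset]

    def frequent_closed(self):
        """Return frequent itemsets that have no proper superset of equal support."""
        frequent = {s: c for s, c in self.total.items() if c >= self.min_support}
        return {
            s: c
            for s, c in frequent.items()
            if not any(s < t and c == ct for t, ct in frequent.items())
        }


miner = SlidingWindowMiner(num_basic_windows=2, min_support=2)
miner.add_basic_window([["a", "b"], ["a", "b", "c"]])
miner.add_basic_window([["a", "c"], ["b"]])
closed_now = miner.frequent_closed()
```

In this sketch only the aggregated counts change when a basic window arrives or expires, which mirrors the idea of using the basic window as the updating unit. A real implementation would replace the flat counter with a compact tree and avoid enumerating all subsets.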