Abstract:
Maintaining a synopsis data structure dynamically from data stream is vital for a variety of streaming data applications, such as approximate query or data mining. In many cases, the significance of data item in streams decays with age: this item perhaps conveys critical information first, but, as time goes by, it gets less and less important until it eventually becomes useless. This feature is termed amnesic. Discrete wavelet transform is often used in construction of synopses for streaming data. Proposed in this paper is a wavelet-based hierarchical amnesic synopsis (W-HAS), which includes the amnesic feature of data stream in the generation of wavelet synopses. W-HAS can provide a better approximate representation for data streams with amnesic feature than conventional wavelet synopses. To maintain W-HAS online for evolving data streams, the authors first explore the merger process of two wavelet decompositions, and then implement the addition of data nodes in W-HAS structure based on the merger process. Using the addition of data nodes, W-HAS grows dynamically and hierarchically. The construction methods of W-HAS under sum of squared error (sse) and maximum absolute error metrics are discussed. Further, W-HAS with error control is also explore. Finally, experiments on real and synthetic datasets validated the proposed methods.