一种关于数据流区间Disjoint查询的快速处理算法

王少鹏  闻英友  李  志  赵  宏

一种关于数据流区间Disjoint查询的快速处理算法

王少鹏闻英友李志赵宏

A Fast Processing Algorithm on Section Disjoint Query of Data Stream

Wang Shaopeng, Wen Yingyou, Li Zhi, and Zhao Hong

摘要

摘要: 在关于数据流子序列相似性匹配的研究中，Disjoint查询是很重要的一类，在传感网络和数据挖掘等方面都发挥着非常重要的作用.但现有的研究并没有关注到定长区间上的Disjoint查询问题.直接对每个区间内成员使用Spring算法是解决该问题的NAIVE算法，但是因为NAIVE算法不具有增量计算的特点，所以存在冗余运算.针对NAIVE算法冗余运算的处理问题，提出了边界路径技术.边界路径技术很好地使用了Spring算法在相邻前一区间上的执行结果，使得Spring算法无需对当前区间上每个成员执行，就可以得到Disjoint查询在该区间的查询结果.使用该技术对NAIVE算法进行改造，设计并实现了快速区间Disjoint查询处理算法(fast section Disjoint query processing algorithm, FSDQ)，该算法具有增量计算的特点.实验证明FSDQ算法可以有效减少NAIVE算法所具有的冗余运算，是处理数据流上区间Disjoint查询的有效方法.

Abstract: Disjoint query has played a very important role in the research about similarity match over subsequence of data stream, such as sensor network, data mining and so on. But the existing research did not focus on the disjoint query which is executed for elements of data stream in the fixed-length section. Running Spring algorithm for all the elements in every section directly is the NAIVE algorithm. But this algorithm does not have the characteristic of incremental computation, so it has redundant operation. The boundary path technology is proposed in this paper to deal with this redundant operation. The NAIVE algorithm can get the disjoint query results of current section based on the ones of previous section and need not performing for every element in current section by using this technology, so it has the characteristic of incremental computation. The FSDQ algorithm, which has the characteristic of incremental computation, is proposed by improving NAIVE algorithm with the boundary path technology. Extensive experiments on real and synthetic data sets show that the FSDQ algorithm is an effective way to solve the problem of section disjoint query over data stream, and it can reduce redundant operations the NAIVE algorithm has and meet the requirements of practical application.

HTML全文

参考文献(0)

施引文献

资源附件(0)