Abstract:
Similarity queries over data streams have become essential in many applications, such as smart homes and environmental monitoring. However, few existing studies use the longest common subsequence (LCSS) as the similarity measure. This paper considers similarity queries over data streams based on LCSS. The NAIVE algorithm obtains the query result by computing the LCSS value with the basic dynamic programming method and comparing it against the threshold. Its drawback is that the query result cannot be obtained until every element of the full dynamic programming matrix has been computed. The D2S-PC algorithm is proposed to overcome this drawback. It defines the PS and CC domains of the matrix over every window and exploits the characteristics of the similarity query and of the matrix members in these two domains. With this algorithm, similarity query results can be obtained before the final LCSS length is calculated. Compared with the original NAIVE algorithm, it greatly reduces the number of matrix members that must be computed. Extensive experiments on real and synthetic datasets show that D2S-PC handles LCSS-based similarity queries over data streams effectively while yielding more precise query results, and that it can meet the requirements of practical applications.
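To make the contrast concrete, the following Python sketch compares a NAIVE-style query, which fills the whole dynamic programming matrix before testing the threshold, with a generic early-exit variant that can answer before the final LCSS length is known. This is not the D2S-PC algorithm: the PS and CC domains are not defined in this abstract, so the sketch substitutes two simple bounds (accept once any cell reaches the threshold, reject once the remaining rows cannot make up the deficit). The function names, the matching tolerance `eps`, and the reading of the threshold as a minimum LCSS length are all assumptions made for illustration.

```python
def lcss_length(a, b, eps=0.0):
    """Full dynamic programming LCSS length (two rows kept in memory).

    Two values match when they differ by at most eps (assumed tolerance).
    """
    n = len(b)
    prev = [0] * (n + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[n]


def naive_similar(window, query, threshold, eps=0.0):
    """NAIVE-style query: compute the full LCSS, then compare."""
    return lcss_length(window, query, eps) >= threshold


def early_exit_similar(window, query, threshold, eps=0.0):
    """Generic early-exit query (illustrative stand-in, not D2S-PC).

    Returns the same answer as naive_similar but may stop before the
    matrix is complete.
    """
    m, n = len(window), len(query)
    if threshold <= 0:
        return True
    if threshold > min(m, n):
        return False  # LCSS can never exceed the shorter sequence
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            if abs(window[i - 1] - query[j - 1]) <= eps:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
            if cur[j] >= threshold:
                return True  # cell values never decrease later on
        # Each remaining row can raise the final value by at most 1.
        if cur[n] + (m - i) < threshold:
            return False
        prev = cur
    return False
```

Both functions return the same Boolean for any input; the second simply skips the cells that can no longer change the answer, which is the general kind of saving the abstract attributes to D2S-PC.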