Abstract:
For time series classification, classifiers built on Shapelets achieve high classification accuracy and produce easily interpretable results. The extraction of discriminative Shapelets has therefore attracted considerable attention in the field of time series data mining. Research on Shapelets extraction has made promising progress, but some problems remain, chiefly that traversing all time series subsequences to find the discriminative Shapelets is extremely time consuming. Although pruning techniques can accelerate the extraction process, they usually reduce classification accuracy. In this paper, we propose a novel Shapelets extraction method based on similarity join, which abandons the idea of computing the discriminative power of each subsequence. In the proposed method, each subsequence is treated as a basic computing unit, and the similarity vector of two time series is obtained by a similarity join over their subsequences. For time series with different class labels, we compute the difference vector of each time series pair and merge these vectors into a candidate matrix that represents the differences between time series classes. Eligible Shapelets can then be obtained directly from the candidate matrix. Extensive experiments on real-world time series datasets show that, compared with existing Shapelets extraction methods, the proposed method is highly time efficient while maintaining excellent classification accuracy.
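
The pipeline summarized above (subsequence join, per-pair vectors, candidate matrix) can be illustrated with a small sketch. The abstract does not define the join, the difference vector, or the selection rule, so everything here is an assumption made for illustration: the join is taken to be a nearest-neighbor join under z-normalized Euclidean distance, the per-pair vector is the join vector itself (large values mark subsequences with no close match in the other class), and a candidate is scored by its column mean. The function names `join_vector`, `candidate_matrix`, and `best_candidate` are hypothetical and are not taken from the paper.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def subsequences(series, m):
    """All z-normalized sliding windows of length m."""
    return [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]


def join_vector(a, b, m):
    """For each length-m subsequence of a, the distance to its nearest neighbor in b."""
    subs_b = subsequences(b, m)
    return [min(math.dist(sa, sb) for sb in subs_b) for sa in subsequences(a, m)]


def candidate_matrix(series_a, others, m):
    """Stack the join vectors of series_a against each differently labeled series (assumed layout)."""
    return [join_vector(series_a, b, m) for b in others]


def best_candidate(series_a, others, m):
    """Return (start, score) of the subsequence that is, on average, farthest from the other class."""
    matrix = candidate_matrix(series_a, others, m)
    scores = [sum(column) / len(column) for column in zip(*matrix)]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Toy data: class A contains a bump, class B is flat.
series_a = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 0]
series_b = [0] * 11
start, score = best_candidate(series_a, [series_b], m=3)
print(start, round(score, 3))
```

This brute-force join costs quadratic time in the series length for each pair, so it only illustrates the data flow and says nothing about the efficiency reported for the proposed method.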