Time Series Anomaly Pattern Recognition Based on Adaptive <i>k</i> Nearest Neighbor

王玲; 周南; 申鹏

doi:10.7544/issn1000-1239.202111062

Abstract

Time series are a typical form of data and are widely used in many research fields. An anomaly pattern in a time series signals the occurrence of an unusual situation and is important in many domains. Most existing time series anomaly pattern recognition algorithms only detect anomalous subsequences: they ignore the problem of distinguishing between types of anomalous subsequences, and many of their parameters must be set manually. To address this, an anomaly pattern recognition algorithm based on adaptive k nearest neighbor (APAKN) is proposed. First, the adaptive k-nearest-neighbor value of each subsequence is determined, and an adaptive distance ratio is introduced to compute the relative density of each subsequence, from which its anomaly score is obtained. Then, an adaptive threshold method based on minimum variance is proposed to determine the anomaly threshold and detect all anomalous subsequences. Finally, the anomalous subsequences are clustered, and the resulting cluster centers are the anomaly patterns with different changing trends. Without requiring any parameter to be set, the algorithm solves the density imbalance problem and simplifies the steps of traditional density-based anomalous subsequence detection, while achieving good anomaly pattern recognition. Experimental results on 10 data sets from the UCR time series collection show that the proposed algorithm, with no parameters to set, performs well in both anomalous subsequence detection and anomalous subsequence clustering.
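The first two stages of the pipeline (density-based scoring, then a variance-based threshold) can be illustrated with a minimal Python sketch. It is not the APAKN implementation: the abstract gives no formulas, so the sketch substitutes a fixed `k` for the adaptive neighbor value, a simple LOF-like ratio for the adaptive distance ratio, and a two-group minimum within-group variance split as one plausible reading of the minimum-variance threshold. All function names and the parameters `width`, `step`, and `k` are hypothetical, and the clustering stage is omitted.

```python
import math


def sliding_windows(series, width, step):
    """Cut the series into subsequences of equal width."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def euclidean(a, b):
    """Euclidean distance between two equal-length subsequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relative_density_scores(subseqs, k):
    """LOF-like score: own mean k-NN distance over the neighbours' mean k-NN distance.

    ASSUMPTION: k is fixed here; the paper chooses it adaptively per subsequence.
    Needs at least two subsequences and 1 <= k.
    """
    n = len(subseqs)
    neighbours, mean_dist = [], []
    for i in range(n):
        dists = sorted((euclidean(subseqs[i], subseqs[j]), j) for j in range(n) if j != i)
        nearest = dists[:k]
        neighbours.append([j for _, j in nearest])
        mean_dist.append(sum(d for d, _ in nearest) / len(nearest))
    eps = 1e-12  # guards against division by zero for identical subsequences
    scores = []
    for i in range(n):
        local = sum(mean_dist[j] for j in neighbours[i]) / len(neighbours[i])
        scores.append((mean_dist[i] + eps) / (local + eps))
    return scores


def min_variance_threshold(scores):
    """Pick the cut that minimises the summed within-group squared deviation.

    ASSUMPTION: a two-group split stands in for the paper's threshold rule.
    """
    ordered = sorted(scores)

    def sse(group):
        mean = sum(group) / len(group)
        return sum((v - mean) ** 2 for v in group)

    best_cut = min(range(1, len(ordered)),
                   key=lambda c: sse(ordered[:c]) + sse(ordered[c:]))
    # Threshold sits halfway between the two groups.
    return (ordered[best_cut - 1] + ordered[best_cut]) / 2


def detect_anomalies(series, width, step, k):
    """Return the indices of subsequences whose score exceeds the threshold."""
    subseqs = sliding_windows(series, width, step)
    scores = relative_density_scores(subseqs, k)
    threshold = min_variance_threshold(scores)
    return [i for i, s in enumerate(scores) if s > threshold]
```

Calling `detect_anomalies(series, width, step, k)` on a list of floats returns the indices of the flagged windows. The brute-force neighbor search is quadratic in the number of subsequences, which keeps the sketch short but would not scale to long series.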
