A Fault Localization Method Based on the Kolmogorov-Smirnov Test

叶  钢  余  丹  李重文  李先军  尹  杰  吕江花  马世龙

Fault Localization Based on Kolmogorov-Smirnov Testing Model

Ye Gang, Yu Dan, Li Zhongwen, Li Xianjun, Yin Jie, Lü Jianghua, and Ma Shilong

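Summary: Existing methods based on the central limit theorem and parametric hypothesis testing are regarded as an efficient fault localization technique. However, experimental results show that on some data sets the total number of test cases is too small for the central limit theorem to apply. The results also show that the actual distribution of predicates deviates from the normal distribution assumed by methods based on parametric hypothesis testing. Based on these findings, a fault localization method based on the Kolmogorov-Smirnov test is proposed. Experimental results on the Siemens suite and on large programs show that the method is well suited to small samples and to non-normally distributed samples. If a predicate is not evaluated during the execution of a test case, existing methods set the evaluation bias of that predicate in that run to 0.5. The effectiveness of this setting is investigated on the Siemens suite, and the results show that, for the fault localization method based on the Kolmogorov-Smirnov test, this setting improves the efficiency of fault localization.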

Abstract: Software debugging is time-consuming and is often a bottleneck in the software development process. Techniques that reduce the time required to locate faults can have a significant impact on the cost and quality of software development and maintenance. Among these techniques, methods based on predicate evaluations have been shown to be promising for fault localization. Many existing predicate-based statistical fault localization techniques compare the feature spectra of successful and failed runs. Some of these approaches test the similarity of the feature spectra through parametric hypothesis testing models based on the central limit theorem. However, our findings show that the precondition for the central limit theorem and the assumption that feature spectra form normal distributions are not well supported by empirical data. This paper proposes a non-parametric approach, the Kolmogorov-Smirnov test, to measure the similarity of the feature spectra of successful and failed runs. We also compare our approach with SOBER, a method based on the parametric hypothesis testing model. The empirical results on the Siemens suite and on large programs show that our approach can outperform SOBER, especially on small samples and non-normal distributions. If a predicate is never evaluated in a run, SOBER sets its evaluation bias to 0.5. In this paper, we also investigate the effectiveness of this setting for fault localization. The empirical results on the Siemens suite show that, for the method based on the Kolmogorov-Smirnov test, fault localization performs better with the 0.5 setting than without it.
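
To make the idea in the abstract concrete, the following is a minimal sketch of how a two-sample Kolmogorov-Smirnov statistic could be used to compare a predicate's evaluation bias across successful and failed runs. It is not the authors' implementation. The function names, the input layout (one dictionary of `(n_true, n_false)` counts per run), the definition of evaluation bias as the fraction of true evaluations, and the choice to rank predicates directly by the raw KS statistic are all assumptions made for illustration; only the 0.5 default for unevaluated predicates comes from the abstract.

```python
from bisect import bisect_right


def evaluation_bias(n_true, n_false, default=0.5):
    """Fraction of evaluations in one run in which the predicate was true.

    Assumed definition. Returns `default` when the predicate was never
    evaluated in the run (the 0.5 setting mentioned in the abstract).
    """
    total = n_true + n_false
    if total == 0:
        return default
    return n_true / total


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def rank_predicates(passing_runs, failing_runs):
    """Rank predicates by KS distance between passing and failing runs.

    Hypothetical helper: each run is a dict mapping a predicate name to
    its (n_true, n_false) counts; a missing predicate counts as (0, 0).
    """
    predicates = set()
    for run in passing_runs + failing_runs:
        predicates.update(run)
    scores = {}
    for p in predicates:
        passed = [evaluation_bias(*run.get(p, (0, 0))) for run in passing_runs]
        failed = [evaluation_bias(*run.get(p, (0, 0))) for run in failing_runs]
        scores[p] = ks_statistic(passed, failed)
    # Larger distance first; ties broken by predicate name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Because the KS statistic depends only on the empirical distributions, this sketch needs neither a large number of runs nor normally distributed evaluation biases, which is the property the abstract emphasizes. A predicate whose bias distribution differs sharply between passing and failing runs receives a statistic close to 1 and appears near the top of the ranking.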
