Abstract:
Software debugging is time-consuming and is often a bottleneck in the software development process. Techniques that reduce the time required to locate faults can have a significant impact on the cost and quality of software development and maintenance. Among these techniques, methods based on predicate evaluations have been shown to be promising for fault localization. Many existing predicate-based statistical fault localization techniques compare the feature spectra of successful and failed runs. Some of these approaches test the similarity of the feature spectra using parametric hypothesis testing models based on the central limit theorem. However, our findings show that the preconditions of the central limit theorem and the assumption that feature spectra follow normal distributions are not well supported by empirical data. This paper proposes a non-parametric approach, based on the Kolmogorov-Smirnov test, to measure the similarity of the feature spectra of successful and failed runs. We also compare our approach with SOBER, a method based on a parametric hypothesis testing model. The empirical results on the Siemens suite and on large programs show that our approach can outperform SOBER, especially on small samples and non-normal distributions. If a predicate is never evaluated in a run, SOBER sets its evaluation bias to 0.5. In this paper, we also investigate the effectiveness of this setting for fault localization. The empirical results on the Siemens suite show that, for the method based on the Kolmogorov-Smirnov test, fault localization performs better with the 0.5 setting than without it.
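To make the core idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes that the evaluation bias of a predicate in a run is the fraction of its evaluations that are true, and that a predicate is scored by the two-sample Kolmogorov-Smirnov statistic between its evaluation biases in successful and failed runs. The function names and the example counts are hypothetical.

```python
from bisect import bisect_right


def evaluation_bias(n_true, n_false, default=0.5):
    """Fraction of evaluations in which the predicate was true.

    Assumption: a predicate never evaluated in a run gets `default`
    (0.5, the setting the abstract attributes to SOBER).
    """
    total = n_true + n_false
    if total == 0:
        return default
    return n_true / total


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the two empirical distribution functions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to compare them at every distinct observation.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical (true, false) evaluation counts of one predicate per run.
passed_counts = [(9, 1), (8, 2), (0, 0), (7, 3)]
failed_counts = [(1, 9), (2, 8), (0, 10)]

passed_bias = [evaluation_bias(t, f) for t, f in passed_counts]
failed_bias = [evaluation_bias(t, f) for t, f in failed_counts]

# A larger statistic means the predicate behaves more differently in
# failed runs than in successful ones, making it more suspicious.
score = ks_statistic(passed_bias, failed_bias)
print(score)
```

In this sketch, predicates would be ranked by descending score. No normality assumption is needed, because the statistic depends only on the empirical distributions of the two samples.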