基于组合机器学习算法的软件缺陷预测模型

傅艺绮; 董威; 尹良泽; 杜雨晴

doi:10.7544/issn1000-1239.2017.20151052

Software Defect Prediction Model Based on the Combination of Machine Learning Algorithms

Abstract

Software defect prediction uses the metrics extracted from a software product, together with the defects already found in it, to predict as early as possible the defects that may remain, so that testing and verification resources can be allocated sensibly according to the prediction results. Defect prediction based on machine learning can learn models that find software defects comprehensively and automatically, and it has become the main approach to defect prediction. Choosing and studying the machine learning algorithm is critical to improving the efficiency and accuracy of prediction. In this paper, we compare different machine learning defect prediction methods and find that each algorithm has its own advantages on different evaluation metrics. Exploiting these advantages, and drawing on the stacking ensemble learning method, we propose a software defect prediction model based on a combination of machine learning algorithms: the model first runs several prediction algorithms, then adds their prediction results to the original dataset as new software metrics, and finally predicts again. Experiments on the Eclipse dataset show that the model is technically feasible, reduces time cost, and improves accuracy.
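The two-stage idea described in the abstract (predict once, append each prediction to the original metrics, then predict again) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two base learners (a one-metric threshold and a nearest-centroid classifier), the function names, and the toy metric values are all hypothetical and are not taken from the paper or from the Eclipse dataset. For brevity the second stage is trained on in-sample base predictions, whereas stacking setups often use held-out predictions instead.

```python
from statistics import mean


def fit_stump(X, y, col):
    """Base learner 1 (assumed): threshold one metric midway between class means."""
    pos = mean(row[col] for row, label in zip(X, y) if label == 1)
    neg = mean(row[col] for row, label in zip(X, y) if label == 0)
    threshold = (pos + neg) / 2
    sign = 1 if pos >= neg else -1
    return lambda row: 1 if sign * (row[col] - threshold) > 0 else 0


def fit_centroid(X, y):
    """Base learner 2 (assumed): assign the class whose centroid is nearest."""
    centroids = {}
    for cls in (0, 1):
        rows = [row for row, label in zip(X, y) if label == cls]
        centroids[cls] = [mean(column) for column in zip(*rows)]

    def predict(row):
        dist = {
            cls: sum((a - b) ** 2 for a, b in zip(row, centre))
            for cls, centre in centroids.items()
        }
        return min(dist, key=dist.get)

    return predict


def fit_combined(X, y):
    """Stage 1: fit base learners. Stage 2: refit on metrics plus their predictions."""
    base = [fit_stump(X, y, 0), fit_centroid(X, y)]

    def augment(row):
        # Base predictions are appended as extra "software metrics".
        return list(row) + [predict(row) for predict in base]

    meta = fit_centroid([augment(row) for row in X], y)
    return lambda row: meta(augment(row))


# Hypothetical toy data: [lines of code, cyclomatic complexity]; 1 = defective.
X_train = [[120, 3], [200, 5], [150, 4], [900, 25], [1100, 30], [800, 22]]
y_train = [0, 0, 0, 1, 1, 1]

model = fit_combined(X_train, y_train)
print(model([1000, 28]), model([100, 2]))
```

Any pair of classifiers could stand in for the two base learners here; the point of the sketch is only the data flow, in which the second-stage learner sees the original metrics and the first-stage predictions side by side.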
