Guo Hongjing, Tao Chuanqi, Huang Zhiqiu. Surprise Adequacy-Guided Deep Neural Network Test Inputs Generation[J]. Journal of Computer Research and Development, 2024, 61(4): 1003-1017. DOI: 10.7544/issn1000-1239.202220745
Surprise Adequacy-Guided Deep Neural Network Test Inputs Generation

Funds: This work was supported by the Key Program of the National Natural Science Foundation of China (U224120044), the National Natural Science Foundation of China (62202223), the Natural Science Foundation of Jiangsu Province (BK20220881), the Open Fund Project of the State Key Laboratory for Novel Software Technology (KFKT2021B32), and the Fundamental Research Funds for the Central Universities (NT2022027).
  • Author Bio:

    Guo Hongjing: born in 1996. PhD candidate. Student member of CCF. Her main research interests include intelligent software testing

    Tao Chuanqi: born in 1984. PhD, associate professor. Senior member of CCF. His main research interests include intelligent software development and quality assurance of intelligent software

    Huang Zhiqiu: born in 1965. PhD, professor. Distinguished member of CCF. His main research interests include software quality assurance, system safety, and formal methods

  • Received Date: August 23, 2022
  • Revised Date: August 14, 2023
  • Available Online: January 24, 2024
  • Abstract: Because deep neural network (DNN) models are complex and their behavior is uncertain, generating test inputs that thoroughly exercise both their general and corner-case behaviors is essential for ensuring model quality. Current research focuses primarily on designing coverage criteria and using fuzz testing techniques to generate test inputs, thereby improving test adequacy. However, few studies consider the diversity of test inputs or the fault-revealing ability of individual inputs. Surprise adequacy quantifies the difference in neuron activations between a test input and the training set. It is an important test adequacy metric, but it has not yet been leveraged for test input generation. We therefore propose a surprise adequacy-guided test input generation approach. First, the approach selects the important neurons that contribute most to decision-making and uses their activation values as features to improve the surprise adequacy metric. Then, seed test inputs with error-revealing capability are selected based on the improved surprise adequacy measurements. Finally, following the idea of coverage-guided fuzz testing, the approach jointly optimizes the surprise adequacy value of test inputs and the prediction probability differences among classes. The gradient ascent algorithm is used to compute the perturbation and iteratively generate test inputs. Empirical studies on 5 DNN models covering 4 image datasets demonstrate that the improved surprise adequacy metric effectively captures surprising test inputs and reduces the time cost of the calculation. For test input generation, the follow-up test set generated with the proposed seed input selection strategy improves surprise coverage by up to 5.9% and 15.9% compared with DeepGini and RobOT, respectively. Compared with DLFuzz and DeepXplore, the proposed approach improves surprise coverage by up to 26.5% and 33.7%, respectively.
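
To make the notion of surprise adequacy concrete, the following is a minimal sketch of one common distance-based formulation: the distance from a test input's activation vector to the nearest training activation of the same class, divided by the distance from that training activation to the nearest one of any other class. The abstract does not state which variant the approach builds on, so this is an illustrative assumption rather than the authors' improved metric. The function name, the toy 2-D activation vectors, and the labels are all hypothetical; in practice the activation vectors would be taken from the selected neurons of a trained DNN.

```python
import numpy as np


def distance_based_surprise(test_act, test_label, train_acts, train_labels):
    """Illustrative distance-based surprise score (hypothetical helper).

    test_act     : 1-D activation vector of the test input
    test_label   : class predicted for the test input
    train_acts   : 2-D array, one activation vector per training input
    train_labels : 1-D array of training labels
    """
    same = train_acts[train_labels == test_label]
    other = train_acts[train_labels != test_label]

    # Distance to the closest training activation of the same class.
    d_same = np.linalg.norm(same - test_act, axis=1)
    nearest_same = same[np.argmin(d_same)]
    dist_a = d_same.min()

    # Distance from that reference point to the closest other-class activation.
    dist_b = np.linalg.norm(other - nearest_same, axis=1).min()

    return dist_a / dist_b


# Toy data (assumed for illustration only): two classes on a line.
train_acts = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0]])
train_labels = np.array([0, 0, 1, 1])

# Both inputs are predicted as class 0; the second lies nearer to class 1.
score_typical = distance_based_surprise(
    np.array([1.5, 0.0]), 0, train_acts, train_labels)
score_surprising = distance_based_surprise(
    np.array([3.0, 0.0]), 0, train_acts, train_labels)
```

In this toy setup the input closer to the other class receives the larger score, which is the property a surprise-guided generator would try to increase when perturbing a seed input.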

