ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2021, Vol. 58, Issue (8): 1668-1685. doi: 10.7544/issn1000-1239.2021.20210297

Special Issue: 2021 Special Topic on Frontier Advances in Artificial Intelligence

Software Vulnerability Detection Method Based on Code Property Graph and Bi-GRU

Xiao Tianming, Guan Jianbo, Jian Songlei, Ren Yi, Zhang Jianfeng, Li Bao   

  1. (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073)
  • Online: 2021-08-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61872444, U19A2060) and the National Key Research and Development Program of China (2018YFB0204301).

Abstract: As software grows larger and more complex, vulnerable code takes increasingly diverse forms. Traditional vulnerability detection methods rely heavily on human participation and are weak at detecting unknown vulnerabilities, so they cannot cope with this diversity. To improve the detection of unknown vulnerabilities, many machine learning methods have been applied to software vulnerability detection. However, their code representations lose much of the syntactic and semantic information, which leads to high false positive and false negative rates. To address this issue, a software vulnerability detection method based on the code property graph and Bi-GRU is proposed. The method extracts the abstract syntax tree sequence and the control flow graph sequence from the code property graph of a function and uses them as the representation of that function, which reduces the information lost in code representation. The method also uses Bi-GRU to build the feature extraction model, which improves feature extraction for vulnerable code. Experimental results show that, compared with a method that uses the abstract syntax tree representation, this method improves accuracy and recall by 35% and 22%, respectively. It also improves vulnerability detection on a real data set that mixes source code from multiple software projects, and effectively reduces the false positive and false negative rates.

Key words: vulnerability detection, code property graph, code representation, machine learning, Bi-GRU
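
The representation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy graph, the node labels, the traversal orders (depth-first preorder for the abstract syntax tree, breadth-first for the control flow graph), and the `<SEP>` separator are all assumptions made for illustration, since the abstract does not specify them. The Bi-GRU stage that would consume the resulting sequence is omitted.

```python
from collections import deque

# Toy code property graph for a hypothetical function:
#   int f(int x) { if (x > 0) x = x - 1; return x; }
# Node ids, labels, and edge lists are illustrative assumptions.
NODES = {
    0: "FunctionDef",
    1: "IfStatement",
    2: "Condition: x > 0",
    3: "Assignment: x = x - 1",
    4: "ReturnStatement: x",
}
AST_EDGES = {0: [1, 4], 1: [2, 3]}        # parent -> children, in source order
CFG_EDGES = {0: [2], 2: [3, 4], 3: [4]}   # node -> control flow successors


def ast_sequence(root):
    """Flatten the AST edges into a sequence by depth-first preorder traversal."""
    seq, stack = [], [root]
    while stack:
        node = stack.pop()
        seq.append(NODES[node])
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(AST_EDGES.get(node, [])))
    return seq


def cfg_sequence(entry):
    """Flatten the CFG edges into a sequence by breadth-first traversal,
    visiting each node once."""
    seq, seen, queue = [], {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        seq.append(NODES[node])
        for succ in CFG_EDGES.get(node, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seq


def function_representation(root):
    """Join both sequences; the separator token is an assumed convention."""
    return ast_sequence(root) + ["<SEP>"] + cfg_sequence(root)


if __name__ == "__main__":
    for token in function_representation(0):
        print(token)
```

In this sketch the two sequences carry different information: the AST sequence keeps the syntactic nesting (the `IfStatement` node appears), while the CFG sequence keeps only the order in which statements can execute. In a full pipeline, each token would be mapped to a vector before being fed to a bidirectional recurrent model.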
