Abstract:
For large-scale and complex software nowadays, the forms of vulnerability code tend to be more diversified. Traditional vulnerability detection methods can not meet the requirements of diverse vulnerabilities because of their high degree of human participation and weak ability of unknown vulnerability detection. In order to improve the detection effect of unknown vulnerability, a large number of machine learning methods have been applied to the field of software vulnerability detection. Due to the high loss of syntax and semantic information in code representation, the false positive rate and false negative rate are high. To solve this issue, a software vulnerability detection method based on code property graph and Bi-GRU is proposed. This method extracts the abstract syntax tree sequence and the control flow graph sequence from the code property graph of the function as the representation method of the function representation. The representation method can reduce the loss of information in the code representation. At the same time, the method selects Bi-GRU to build feature extraction model. It can improve the feature extraction ability of vulnerability code. Experimental results show that, compared with the method represented by abstract syntax tree, this method can improve the accuracy and recall by 35% and 22%. It can improve the vulnerability detection effect of real data set for multiple software source code mixing, and effectively reduce the false positive rate and false negative rate.