Abstract:
Reusing existing high-quality source code can improve both the efficiency of software development and the quality of the resulting software. At present, code search based on user-provided input/output queries is one of the main approaches to code semantic search, but existing approaches struggle to describe the complete behavior of code and can handle only a single input type. This paper proposes a code semantic search approach, based on the reachability analysis of Petri Nets, that matches multiple forms of types. First, the semantic processes of the code snippets in the code corpus, characterized by the number and types of their data objects, are converted into improved Petri Net models. Second, the initial marking and target marking of the Petri Net models are constructed from the number and types of data objects contained in the user's query. Matching code snippets are then obtained by analyzing reachable paths in the reachability graphs and induced networks of the Petri Nets. Analysis and experimental results show that this approach helps find the desired code snippets for user queries with multiple forms of input/output types and that, compared with traditional approaches, it can significantly improve the accuracy and efficiency of code search.
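
The general idea of searching by reachability can be illustrated with a minimal sketch. It is not the paper's improved Petri Net model or its induced-network analysis. It assumes a plain place/transition net in which each place is a data type, each token is one data object of that type, and each transition is a code snippet that consumes and produces typed objects. The corpus `TRANSITIONS`, the snippet names, and the functions `enabled`, `fire`, and `search` are all hypothetical and exist only for illustration.

```python
from collections import Counter, deque

# Hypothetical corpus (an assumption, not from the paper): each transition is
# a code snippet given as (tokens consumed per type, tokens produced per type).
TRANSITIONS = {
    "parse_int": (Counter({"str": 1}), Counter({"int": 1})),
    "add": (Counter({"int": 2}), Counter({"int": 1})),
    "to_text": (Counter({"int": 1}), Counter({"str": 1})),
}


def enabled(marking, needed):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[place] >= count for place, count in needed.items())


def fire(marking, needed, produced):
    """Return the marking reached by firing one enabled transition."""
    result = Counter(marking)
    result.subtract(needed)
    result.update(produced)
    return +result  # unary plus drops places that hold zero tokens


def search(initial, target, transitions=TRANSITIONS, max_depth=6):
    """Breadth-first search of the reachability graph for the target marking.

    Returns the firing sequence (list of snippet names) that turns the
    initial marking into the target marking, or None if none is found
    within max_depth firings.
    """
    start = +Counter(initial)
    goal = +Counter(target)
    queue = deque([(start, [])])
    seen = {frozenset(start.items())}
    while queue:
        marking, path = queue.popleft()
        if marking == goal:
            return path
        if len(path) >= max_depth:
            continue
        for name, (needed, produced) in transitions.items():
            if not enabled(marking, needed):
                continue
            successor = fire(marking, needed, produced)
            key = frozenset(successor.items())
            if key not in seen:
                seen.add(key)
                queue.append((successor, path + [name]))
    return None
```

In this sketch, a query with two `str` inputs and one `int` output becomes `search({"str": 2}, {"int": 1})`, and the returned firing sequence is `["parse_int", "parse_int", "add"]`. The depth bound is a simplification that keeps the search finite, since the reachability graph of a general Petri net can be unbounded.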