ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (3): 531-540. doi: 10.7544/issn1000-1239.2016.20148325


A Graph Database Based Method for Parsing and Searching Code Structure


  1. (School of Electronic Engineering & Computer Science, Peking University, Beijing 100871) (Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871)
  • Online: 2016-03-01

Abstract: Software reuse reduces duplicated effort in software development. To reuse an existing software project, developers usually need to understand how its code elements work and how they relate to one another, which is called the code structure. Developers typically navigate among source code files to understand the code structure. This task can be time-consuming and difficult, since the source code of a software project is usually large and complex. It is therefore essential to present the code structure automatically, in a form that developers can understand clearly. For this purpose, this paper introduces a graph database based method for parsing and searching code structure. The code structure is extracted from source code files and organized as a labeled, directed graph in a graph database. Developers input natural language queries. A search mechanism analyzes each query, searches the whole code structure, and determines which part of the code structure should be presented. The method is highly extensible: code elements at different granularities and various types of relationships among them can easily be stored in the graph database, and analysis algorithms for different search purposes can easily be integrated into the search mechanism. A tool is implemented based on this method. An experiment shows that, with the help of this tool, the time developers spend on understanding code structure is reduced by 17%, which validates that the method helps improve the efficiency of software reuse. An industrial case study shows how developers benefit from this method.
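
The idea of storing code structure as a labeled, directed graph and searching it with a natural language query can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the in-memory `CodeGraph` class stands in for a real graph database, the class and method names and the relationship types (`HAS_METHOD`, `USES`) are hypothetical, and the word-overlap matching in `search` is a naive placeholder rather than the paper's search mechanism.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CodeGraph:
    """A labeled, directed graph of code elements (in-memory stand-in for a graph database)."""
    nodes: dict = field(default_factory=dict)  # element name -> label, e.g. "Class" or "Method"
    edges: list = field(default_factory=list)  # (source, relationship type, target)

    def add_node(self, name, label):
        self.nodes[name] = label

    def add_edge(self, source, rel_type, target):
        self.edges.append((source, rel_type, target))


def split_identifier(name):
    """Split a camelCase or dotted identifier into a set of lowercase words."""
    return {w.lower() for w in re.findall(r"[A-Za-z][a-z]*", name)}


def search(graph, query):
    """Return the edges touching any node whose name shares a word with the query."""
    query_words = {w.lower() for w in re.findall(r"[A-Za-z]+", query)}
    hits = {n for n in graph.nodes if split_identifier(n) & query_words}
    return [e for e in graph.edges if e[0] in hits or e[2] in hits]


def example_graph():
    """Build a tiny hypothetical code structure (all names are made up)."""
    g = CodeGraph()
    g.add_node("OrderService", "Class")
    g.add_node("OrderService.placeOrder", "Method")
    g.add_node("Order", "Class")
    g.add_node("InvoicePrinter", "Class")
    g.add_node("InvoicePrinter.print", "Method")
    g.add_edge("OrderService", "HAS_METHOD", "OrderService.placeOrder")
    g.add_edge("OrderService.placeOrder", "USES", "Order")
    g.add_edge("InvoicePrinter", "HAS_METHOD", "InvoicePrinter.print")
    return g


if __name__ == "__main__":
    for source, rel_type, target in search(example_graph(), "where is an order placed"):
        print(f"{source} -[{rel_type}]-> {target}")
```

In this sketch, new element granularities or relationship types only require further `add_node` and `add_edge` calls with different labels, and a different analysis strategy can replace `search` without touching the stored graph, which loosely mirrors the extensibility described in the abstract.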

Key words: code structure, graph database, natural language query, search mechanism, software reuse
