Abstract:
Cloned code, also known as duplicated code, is a well-known code smell. Procedure extraction can remove clones and make a software system more maintainable. However, some clone fragments cannot be extracted directly with the existing syntax-preserving procedure extraction algorithm. To address this problem, this paper proposes a new semantics-preserving amorphous procedure extraction algorithm. The approach analyzes program semantic information using the program dependence graph and the abstract syntax tree. Because it preserves semantics rather than syntax, it can handle continuous clone code that the traditional method cannot extract directly, it reduces the need to promote unmarked statements, and it does not need to handle existing jumps that the traditional procedure extraction method cannot address. Experimental results show that the method can extract code clones that the traditional method cannot, improving the accuracy and adaptability of procedure extraction. We conclude that the proposed algorithm can be a useful addition to the toolboxes of software managers and developers, helping them improve code structure and reduce software maintenance and project costs.