
Non-Contiguous Code Refactoring: A Hybrid Approach Integrating Static Analysis and Large Language Models

Abstract: The widespread adoption of Large Language Models (LLMs) in software engineering has made automated code refactoring, leveraging their powerful code comprehension and generation capabilities, a crucial direction for enhancing software quality and development efficiency. However, when refactoring non-contiguous code clones, which arise from statement interleaving, reordering, and similar transformations, LLMs face three core challenges: dispersed semantic context, difficulty in capturing critical dependencies, and susceptibility to hallucination errors. To address these challenges, we propose a non-contiguous code clone refactoring method that integrates static analysis with LLMs. The method first identifies non-contiguous clones efficiently and accurately by combining program slicing with an algebraic classifier. Next, a context-aware refactoring opportunity identification algorithm determines the optimal refactoring targets for the LLM. Finally, a Chain-of-Thought few-shot prompting strategy guides the LLM to generate high-quality Extract Method refactoring suggestions, and a verification mechanism based on metamorphic relations validates the semantic and structural consistency of the generated results. The proposed refactoring method reduced clone code by 66% to 71% in real-world projects such as JUnit. In addition, experiments on the open-source datasets Google Code Jam and BigCloneBench show that the proposed detection method achieves an F1-score 2% to 18% higher than existing mainstream tools, and on the Community Corpus-A refactoring opportunity identification benchmark it reaches an F1-score of 0.415, surpassing the state-of-the-art tool GEMS by 7.5%, thereby improving software quality.
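The Chain-of-Thought few-shot prompting step described in the abstract could be assembled along these lines. This is a minimal sketch only: the function name `build_prompt`, the worked example, and all prompt wording are illustrative assumptions, not the authors' actual prompt or implementation.

```python
# Hypothetical sketch of a Chain-of-Thought few-shot prompt for Extract Method
# suggestions on a non-contiguous clone pair. All names and wording here are
# assumptions for illustration; the paper's real prompt is not shown in the abstract.

FEW_SHOT_EXAMPLE = """\
Clone pair (non-contiguous statements shared by two methods):
    int n = tokens.size();
    sum += n;
Step 1: Identify the statements duplicated across both clones.
Step 2: Trace the data and control dependencies of the interleaved statements.
Step 3: Propose an Extract Method refactoring with its parameters and return value.
Suggested refactoring:
    private int countAndAccumulate(List<String> tokens, int sum) { ... }
"""

def build_prompt(clone_a: str, clone_b: str, context: str) -> str:
    """Assemble a few-shot prompt that asks the LLM to reason step by step."""
    return (
        "You are a refactoring assistant. Reason step by step before answering.\n\n"
        "Worked example:\n" + FEW_SHOT_EXAMPLE + "\n"
        "Now refactor the following non-contiguous clone pair.\n"
        f"Surrounding context:\n{context}\n"
        f"Clone A:\n{clone_a}\n"
        f"Clone B:\n{clone_b}\n"
        "Answer with Step 1..3 and the extracted method."
    )

prompt = build_prompt("x = a + b;", "y = a + b;", "class Calc { ... }")
```

The prompt pairs one worked chain-of-thought example with the target clone pair, so the model is steered toward explicit dependency reasoning before emitting the extracted method rather than jumping straight to code.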

       
