Requirement Ambiguity Detection Method Based on Large Language Model

高俊涛; 刘芳; 杨溢龙

doi:10.7544/issn1000-1239.202550700

Abstract: In modern software development, requirement ambiguity is a key cause of project failure, cost overruns, and quality problems, so its automated detection has been studied extensively. Existing approaches reduce the time cost of manual review, but they address only some kinds of ambiguity and do not fully cover the six ambiguity types associated with linguistic ambiguity. Traditional methods lack deep semantic understanding and the ability to recognize logical relations and reference/anaphora relations; this limits their semantic processing and leaves them unable to detect pragmatic ambiguity and linguistic error ambiguity. To address this, we propose LMAdetect, a large language model-based method for automatically detecting requirement ambiguity. Given a requirement, LMAdetect analyzes it using heuristic rules and a large language model and classifies it into the corresponding ambiguity type. Because the large language model often produces differing ambiguity types, LMAdetect applies a rule-based algorithm to the model's classification results that assigns the different ambiguity types to different experts for detection. Finally, the detection results are aggregated by confidence, and the high-confidence classification is output. In experiments on 1,192 data items covering the 6 ambiguity types, LMAdetect improves the F1 score by 34.4 percentage points over a traditional rule-based method and by 31.6 percentage points over a method that uses only a large language model. For pragmatic ambiguity and linguistic error ambiguity, the F1 scores reach 0.6950 and 0.8889, respectively, demonstrating the method's advantages in requirement ambiguity detection.
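
The routing and confidence-based aggregation steps summarized in the abstract can be illustrated with a minimal sketch. This is not LMAdetect's implementation: the names `Verdict`, `detect`, `pragmatic_expert`, `language_error_expert`, and the `threshold` parameter are hypothetical, the two experts are toy string checks standing in for large language model calls, and the confidence values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Verdict:
    """Result returned by one per-type expert (hypothetical structure)."""
    ambiguity_type: str
    is_ambiguous: bool
    confidence: float  # assumed to lie in [0, 1]


Expert = Callable[[str], Verdict]


def pragmatic_expert(req: str) -> Verdict:
    # Toy stand-in for an LLM call: flags a standalone pronoun "it".
    hit = " it " in f" {req.lower()} "
    return Verdict("pragmatic", hit, 0.8 if hit else 0.2)


def language_error_expert(req: str) -> Verdict:
    # Toy stand-in for an LLM call: flags a doubled space.
    hit = "  " in req
    return Verdict("language_error", hit, 0.6 if hit else 0.1)


def detect(requirement: str,
           candidate_types: List[str],
           experts: Dict[str, Expert],
           threshold: float = 0.5) -> Optional[Verdict]:
    """Send each candidate type to its expert, then keep the most
    confident positive verdict at or above the threshold."""
    verdicts = [experts[t](requirement) for t in candidate_types if t in experts]
    positives = [v for v in verdicts
                 if v.is_ambiguous and v.confidence >= threshold]
    if not positives:
        return None
    return max(positives, key=lambda v: v.confidence)


EXPERTS: Dict[str, Expert] = {
    "pragmatic": pragmatic_expert,
    "language_error": language_error_expert,
}

if __name__ == "__main__":
    req = "The system sends the report to the user after it is updated."
    print(detect(req, ["pragmatic", "language_error"], EXPERTS))
```

In this sketch the candidate types would come from the initial classification step; a requirement for which no expert reports an ambiguity above the threshold yields `None`.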

Requirement Ambiguity Detection Method Based on Large Language Model