Hallucination Detection for Large Language Models: A Survey
Abstract
In recent years, large language models (LLMs) have made significant strides in natural language processing (NLP), demonstrating impressive capabilities in both language understanding and generation. Despite these advances, LLMs still face numerous challenges in practical applications. One issue that has attracted extensive attention from both academia and industry is hallucination. Effectively detecting hallucinations in LLMs is critical to ensuring their reliable, secure, and trustworthy use in downstream tasks such as text generation. This paper provides a comprehensive review of methods for detecting hallucinations in large language models. First, it introduces the concept of large language models and clarifies the definition and classification of hallucinations; it then systematically examines the characteristics of LLMs throughout their lifecycle, from construction to deployment, and analyzes the mechanisms and causes of hallucinations. Second, based on practical application requirements and considering factors such as model transparency in different task scenarios, it categorizes hallucination detection methods into two types, those for white-box models and those for black-box models, and provides a focused review and in-depth comparison of these methods. The paper then analyzes and summarizes the current mainstream benchmarks for hallucination detection, laying a foundation for future research in this area. Finally, it identifies potential research directions and open challenges in detecting hallucinations in large language models.