Consistent Responses to Paraphrased Questions as Evidence Against Hallucination: A Study on Hallucinations in LLMs
The increasing adoption of large language models (LLMs) has intensified concerns about hallucinations: outputs that are syntactically fluent but factually incorrect. In this paper, we propose a method for detecting such hallucinations by evaluating the consistency of model responses to paraphrased versions of the same question. The underlying assumption is that if a model produces consistent answers across different paraphrases, its output is more likely to be accurate. To test this method, we developed a system that generates multiple paraphrases of each question and analyzes the consistency of the corresponding responses. Experiments were conducted with two LLMs, GPT-4o and LLaMA 3 70B Chat, on both Persian and English datasets. The method achieved an average accuracy of 99.5% for GPT-4o and 98% for LLaMA 3 70B, indicating its effectiveness in identifying hallucination-free outputs across languages. Furthermore, by automating the consistency evaluation with an instruction-tuned language model, we enabled scalable and unbiased detection of semantic agreement across paraphrased responses.
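The abstract describes the pipeline only at a high level, so the following is a minimal sketch of how such a paraphrase-consistency check could be wired together. The function names (paraphrase_prompts, collect_answers, consistency_verdict), the prompt wording, and the generate/judge callables standing in for calls to the answering model and the instruction-tuned judge are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List


def paraphrase_prompts(question: str, n: int = 4) -> List[str]:
    """Build n prompts asking the model to restate the question without changing its meaning."""
    return [
        f"Paraphrase the following question, keeping its meaning unchanged "
        f"(variant {i + 1}): {question}"
        for i in range(n)
    ]


def collect_answers(generate: Callable[[str], str], question: str, n: int = 4) -> List[str]:
    """Answer the original question plus n paraphrased versions of it."""
    paraphrases = [generate(p) for p in paraphrase_prompts(question, n)]
    return [generate(q) for q in [question, *paraphrases]]


def consistency_verdict(judge: Callable[[str], str], answers: List[str]) -> bool:
    """Ask an instruction-tuned judge model whether the answers agree semantically.

    Returns True (treated as 'likely hallucination-free') only when the judge
    labels the set of answers as consistent.
    """
    numbered = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(answers))
    prompt = (
        "The following answers were given to paraphrases of the same question.\n"
        f"{numbered}\n"
        "Do all answers convey the same factual content? Reply CONSISTENT or INCONSISTENT."
    )
    return "INCONSISTENT" not in judge(prompt).upper()
```

In practice, generate and judge would each wrap a chat-completion call to the answering model (e.g. GPT-4o or LLaMA 3 70B Chat) and to the instruction-tuned evaluator, respectively; keeping them as plain callables leaves the sketch independent of any particular API.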